A Holevo measure is used to discuss how much information about a given POVM on system a is 
present in another system b, and how this influences the presence or absence of information about 
a different POVM on a in a third system c. The main goal is to extend information theorems 
for mutually unbiased bases or general bases to arbitrary POVMs, and especially to generalize 
"all-or-nothing" theorems about information located in tripartite systems to the case of partial 
information, in the form of quantitative inequalities. Some of the inequalities can be viewed as 
entropic uncertainty relations that apply in the presence of quantum side information, as in recent 
work by Berta et al. [Nature Physics 6, 659 (2010)]. All of the results also apply to quantum channels: 
e.g., if the channel $\mathcal{E}$ accurately transmits certain POVMs, the complementary channel will necessarily be noisy 
for certain other POVMs. While the inequalities are valid for mixed states of tripartite systems, 
restricting to pure states leads to the basis-invariance of the difference between the information 
about a contained in b and c. 
I. INTRODUCTION 

A significant part of current quantum information re- 
search can be understood as an attempt to find answers 
to the following question: How much of what kind of in- 
formation about what is located where? In this paper 
we provide specific answers to these questions in the case 
of a general tripartite quantum system: subsystems a, b, 
and c are described by some sort of quantum state (pre- 
probability) that induces a joint probability distribution 
on different properties of these systems. Appropriate sta- 
tistical correlations can then be thought of in terms of, 
for example, system b containing information of some sort 
about certain physical properties of system a. To discuss 
how much information of this kind is contained in or can 
be found in b requires some sort of quantitative measure, 
and it is natural to look for something resembling the 
well-known Shannon measures in classical information; 
see [1] for a modern introduction to this subject. 

Although it is rather natural to treat systems a, b and 
c on an equal footing — and that is the perspective of this 
paper — one can also think of the properties as existing at 
different times. For example, a might be the entrance to 
a quantum channel with b, possibly but not necessarily 
the same physical system, the output of the channel and 
c the "environment" at this later point in time. Such 
a dynamical perspective is well-known in classical infor- 
mation theory as it applies to a noisy channel, where 
it can be discussed using the same information measures, 
e.g., the mutual information $H(X:Y)$, that apply 
to statistically correlated systems (think of a shared 
key used for cryptographic purposes) at the same time, a 
static perspective. Both perspectives are also possible for 
problems in quantum information theory, though this has 
not received as much attention as we think it deserves, 
and viewing expressions which are formally the same (or 
closely related) from distinct points of view can make a 
valuable contribution to one's intuitive understanding of 
a situation. 

Of course, quantum information theory is more gen- 
eral than classical information theory, so the conceptual 
ideas provided by the latter are insufficient for discussing 
the quantum world. In this paper we take the perspective 
that a valuable way to think about the quantum case is to 
distinguish different types or species of quantum information 
[2]. For example, if a is a single qubit the distinction 
between $|0\rangle$ and $|1\rangle$ constitutes the "z" type of information, 
whereas the distinction between $|+\rangle$ and $|-\rangle$, with 
$|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$, is the "x" type. Each type by itself, 
even when it refers to microscopic (thus "quantum") 
properties, follows the usual rules of classical information 
theory. This allows one to immediately transfer a large 
body of mathematical formalism and associated physical 
intuition from the classical to the quantum domain with- 
out risk of falling prey to inconsistencies and paradoxes. 
The quantum nature of the microscopic world then man- 
ifests itself through the fact that incompatible types of 
information, corresponding to non-commuting projective 
decompositions of the identity, cannot be combined: this 
is the single framework rule (see, e.g., Ch. 16 of Q) that 
allows a fully-consistent use of probabilities in the quan- 
tum domain.^ 



^ It is important to note that a type of information as defined 
here refers primarily to a microscopic quantum property rather 
than the outcome of a measurement. A correctly constructed 
measurement apparatus can reveal the property of a microscopic 
system, so that, for example a Stern-Gerlach apparatus followed 
by detectors can determine if the spin-half particle entering the 
apparatus had $S_z = +\hbar/2$ or $-\hbar/2$, corresponding to the qubit 
In this paper we generalize the notion of a type of quantum 
information so that it includes not only a projective 
decomposition of the identity, a set of projectors that 
sum to the identity, but also a general POVM, a collec- 
tion of positive operators that sum to the identity. The 
idea, discussed in Sec. II A, is that while the operators in 
a POVM are in general not orthogonal, each corresponds 
to a projector on a larger Hilbert space, the Naimark ex- 
tension (which is not unique), and the collection of such 
projectors sums to the identity on the larger space thus 
constituting a particular type of quantum information in 
the sense previously discussed. 

The question "How much?" has motivated an ongo- 
ing search for measures that extend the very useful idea 
of entanglement beyond bipartite pure states where it 
was first introduced. Despite a great deal of effort and a 
large number of intriguing results Q , it seems fair to say 
that there remain a large number of unanswered ques- 
tions even for bipartite mixed states, not to mention the 
multipartite case. It is not obvious that a single number 
representing the entanglement, or even a small collection 
of numbers, will suffice to embody the physical insights 
needed for a better understanding of such systems. In 
this paper we introduce measures of information that de- 
pend explicitly on the (quantum) type of information one 
is considering, so we can address the question of, for ex- 
ample, how well a noisy quantum channel performs for 
different types of input. The definitions and a detailed 
discussion of these measures will be found in Sec. III; at 
this point it suffices to note that they are of the Holevo 
form using the quantum von Neumann entropy, though 
in some cases they can be generalized using other types 
of entropy. 

As well as direct quantitative measures of informa- 
tion certain differences in information measures, e.g., the 
amount of information of a given type that is in b minus 
how much is in c, are of interest. We refer to these as 
entropy or information biases. It is not without interest 
that the coherent information Q when expressed in the 
language of tripartite systems is (or at least can be thought 
of as) such a bias; see Sec. III C. In Sec. IV we show that 
under appropriate circumstances an information bias will 
be independent of the type of information under consid- 
eration. 

One of the most striking features of quantum infor- 



states |0) or |1), and in this case the z information initially pos- 
sessed by the particle is translated into distinct macroscopic ap- 
paratus states, making the z information "visible" or "classical." 
(For an important application to quantum information theory 
of the idea that a macroscopic quantum outcome reveals a prior 
microscopic state, see for a detailed discussion of the mea- 
surement process in fully quantum terms, see Chs. 17 and 18 
of However, the concept of z information can also be used 
in situations, such as when a qubit is just entering a quantum 
channel, where trying to relate it to a measurement, at least as 
a physical process occurring at that point in time, is not very 
helpful. 



mation is that if information of a particular type corresponding 
to some orthonormal basis w of system a is 
perfectly present (perfect correlation, no noise) in system 
b for the quantum state under discussion, this prevents 
or excludes a type of information v corresponding to a 
basis mutually unbiased (MU) with respect to w (that 
is, v and w are mutually unbiased bases, or MUBs) from 
being present in a third system c. In Sec. V of this paper 
we present quantitative generalizations of this and some 
other "all-or-nothing" theorems to situations in which, 
for example, almost all information of the w type 
about a is in b and one wants to bound how 
much v information, where v is only approximately MU 
with respect to w, can be present in c. 

In particular, Theorem 5 in Sec. V B presents a bound 
of this form. It extends to POVMs an important in- 
equality proved in Q, earlier conjectured in Q, using 
a somewhat simpler proof. This extension was also re- 
cently proven in Q using smooth entropies; in contrast 
our proof approach is based on the relative entropy. Various 
consequences, including the application to a channel 
and its complementary channel, are worked out in various 
corollaries. As well as being a 
bound on the amounts of two strongly incompatible (in 
the sense of almost MUB) types of information about 
a present in different locations, Theorem 5 constitutes 
a generalized entropic uncertainty relation for system a 
when the coupling to another system or systems is taken 
into account ( "quantum side information" in the sense 
discussed in [3, (To] ) . 

Several additional quantitative generalizations of all-or-nothing 
results are given in Secs. V A, V C, and V D. 
The all-or-nothing results can be succinctly stated as follows 
for orthonormal bases u, v, and w of a, where u and 
v are MU relative to w (but not necessarily to each other). 
If the w type of information is perfectly present in b, then 
(1) $\rho_{ac}$ is block diagonal in the w basis (Lemma 4), (2) 
the amount of u information in b is equal to the amount 
of v information in b (Theorem 5), (3) if the v information 
is perfectly present in b then there is a perfect quantum 
channel from a to b (Theorem 10), (4) if the w information 
is completely absent from c, then no information 
about a is in c: the two are decoupled (Theorem 11). 

The remainder of this paper is organized as follows. 
Section II is an introduction to tripartite systems, including 
the connection with quantum channels and their 
complements, and provides details of what we mean by 
different types of quantum information. Various quantitative 
measures of information are introduced, and some 
of their properties discussed, in Sec. III. Our main results, 
which, as indicated above, provide quantitative 
bounds on the location of various types of information 
in different systems, occupy Secs. IV and V. Section VI 
relates our work to various other approaches and publications. 
A summary, which provides an overview of 
how the different theorems are related to each other, is 
in Sec. VII A, followed by an indication of issues worth 
further exploration in Sec. VII B. To make the main presentation 
compact and easier to follow, all but the very 
shortest proofs have been relegated to appendices. 



II. SYSTEMS WITH THREE PARTS 
A. POVMs and types of information 

Much work in contemporary quantum information the- 
ory is devoted to particular instances of what may be 
called the tripartite system problem defined in the following 
way. Let $\mathcal{H}_{abc} = \mathcal{H}_a \otimes \mathcal{H}_b \otimes \mathcal{H}_c$ be a tensor product 
of Hilbert spaces of dimensions $d_a$, $d_b$, $d_c$, all assumed to 
be finite, and let 



$$I_a = \sum_j P_{aj}, \qquad I_b = \sum_k Q_{bk}, \qquad I_c = \sum_l R_{cl} \tag{1}$$



be three POVMs, decompositions of their respective identities 
into finite sets of positive operators, hereafter referred 
to as $P_a$, etc.^ [Note that we use the symbols a, 
b, and c as subscripts (but occasionally on line) to label 
subsystems, and indices j, k, l, etc. to label the POVM 
elements.] What can be said about the joint probability 
distribution 



$$\Pr(P_{aj}, Q_{bk}, R_{cl}) = \mathrm{Tr}(P_{aj} Q_{bk} R_{cl}\, \rho_{abc}), \tag{2}$$



where $\rho_{abc}$ is a density operator acting as a pre-probability 
(generator of probabilities in the terminology 
of Sec. 9.4 of [3]), perhaps but not necessarily a projector 
on the pure state $|\Omega\rangle$? In particular, what is 
its information-theoretic significance? One is, of course, 
interested in how these probabilities, and the corresponding 
marginal distributions such as 

$$\Pr(P_{aj}, Q_{bk}) = \sum_l \Pr(P_{aj}, Q_{bk}, R_{cl}) = \mathrm{Tr}_{ab}(P_{aj} Q_{bk}\, \rho_{ab}), \tag{3}$$

with $\rho_{ab}$ the partial trace over $\mathcal{H}_c$ of $\rho_{abc}$, depend upon 
the indices j, k, and l. But of equal, or even greater, interest 
is their dependence upon the choice of POVMs in (1). 
Here quantum theory, in contrast to classical physics, al- 
lows an enormous number of possibilities. 

In what follows we shall want to distinguish various dif- 
ferent types of POVM. A rank-1 POVM is one in which 
all the positive operators are of rank 1, which is to say 
proportional to projectors on one-dimensional spaces; we 
will employ symbols L, M, N to denote such POVMs. 
When all the POVM elements are projectors (orthogonal 
projection operators) we have a projective decomposition 



^ It is sometimes helpful to imagine the three parts as residing 
in three different places, say three different laboratories where 
Alice, Bob, and Carol can carry out separate preparations and 
measurements on them. 



of the identity. A rank-1 projective decomposition is associated 
with an orthonormal basis; e.g., the orthonormal 
basis $w = \{|w_j\rangle\}$ of $\mathcal{H}_a$ gives rise to the decomposition 

$$I_a = \sum_j |w_j\rangle\langle w_j|. \tag{4}$$



In what follows we use the lower case letters u, v, and 
w to denote orthonormal bases, and where useful add a 
subscript, e.g., $w_a$, to indicate the corresponding system 
or Hilbert space. A second basis $v = \{|v_j\rangle\}$ is mutually 
unbiased (MU) relative to w (the terms complementary 
or conjugate are also in use), so that v and w are mutually 
unbiased bases (MUBs), when 

$$|\langle v_j|w_k\rangle| = 1/\sqrt{d_a}$$

independent of j and k. 
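
The MU condition can be checked numerically for small examples. The following is a minimal sketch, not part of the original presentation: the function names `inner` and `is_mutually_unbiased` and the tolerance are assumptions made for illustration, and the qubit z, x, and y bases serve as test cases.

```python
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def is_mutually_unbiased(basis_v, basis_w, tol=1e-12):
    """True if |<v_j|w_k>| equals 1/sqrt(d) for every pair (j, k)."""
    d = len(basis_w)
    target = 1.0 / math.sqrt(d)
    return all(abs(abs(inner(v, w)) - target) < tol
               for v in basis_v for w in basis_w)

s = 1.0 / math.sqrt(2.0)
z_basis = [[1, 0], [0, 1]]                  # |0>, |1>
x_basis = [[s, s], [s, -s]]                 # |+>, |->
y_basis = [[s, 1j * s], [s, -1j * s]]       # eigenkets of sigma_y

print(is_mutually_unbiased(x_basis, z_basis))
print(is_mutually_unbiased(y_basis, x_basis))
print(is_mutually_unbiased(z_basis, z_basis))
```

The three qubit bases are pairwise MU, whereas a basis is never MU relative to itself, since some of its overlaps have magnitude 1 and others 0.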

Unlike a general POVM, a projective decomposition 
can be given a simple physical interpretation: the pro- 
jectors, or the subspaces onto which they project, form 
a quantum sample space: a collection of mutually ex- 
clusive physical properties one and only one of which is 
true; see Ch. 5 of [j]. In previous work [2] such a projective 
decomposition was called a type of information: 
e.g., $\Pi_a = \{\Pi_{aj}\}$ is a type of information about the system 
a. Two types of information $\Pi_a$ and $\Lambda_a$ about the 
same system are compatible provided every projector in 
one set commutes with every projector in the other set: 
$\Pi_{aj}\Lambda_{ak} = \Lambda_{ak}\Pi_{aj}$ for every j and k; otherwise they are 
incompatible. Two distinct rank-1 projective decomposi- 
tions, or the corresponding orthonormal bases, are nec- 
essarily incompatible if they differ by more than simply 
relabeling the projectors, and two MUBs are incompat- 
ible to the maximum extent possible. Probabilistic ar- 
guments in quantum mechanics cannot combine results 
from incompatible decompositions — the single framework 
rule, see Ch. 16 of 0] — without risk of generating contra- 
dictions and paradoxes. 

However, in the present paper we generalize the notion 
of a type of information about (say) system a to include 
any POVM Pa when interpreted using a Naimark exten- 
sion; see [ill, 12\ or Sec. 9-6 of [l^. Assume that the 
Hilbert space $\mathcal{H}_a$ is a subspace of a larger Hilbert space 
$\mathcal{H}_A$, with $E_a$ the operator on $\mathcal{H}_A$ that projects onto $\mathcal{H}_a$. 
If $\mathcal{H}_A$ has been appropriately chosen there is a projective 
decomposition $\{\Pi_{Aj}\}$ of its identity $I_A$ such that 

$$P_{aj} = E_a \Pi_{Aj} E_a. \tag{5}$$



In addition, one can always arrange that for each j the 
rank of $\Pi_{Aj}$ is the same as the rank of $P_{aj}$, though one 
may need an additional projector, call it $\Pi_{A0}$, which is 
orthogonal to $E_a$, so the corresponding $P_{a0}$ is the zero 
operator. (It is possible to set things up so that the rank 
of $\Pi_{Aj}$ exceeds that of $P_{aj}$, but in light of (5) the reverse 
is impossible.) An important special case used in 
proving later results is that any rank-1 POVM on a is 
equivalent to some rank-1 projective decomposition (or- 
thonormal basis) on A [l^l . One can if one wishes think of 
$\mathcal{H}_A$ as a tensor product $\mathcal{H}_a \otimes \mathcal{H}_e$, where $\mathcal{H}_e$ is the Hilbert 
space of some reference system, and $\mathcal{H}_a$ is itself (isomorphic 
to) the subspace of kets of the form $|\psi\rangle \otimes |e_0\rangle$, with 
$|e_0\rangle$ a fixed, normalized ket in $\mathcal{H}_e$. In this case the density 
operator on A is $\rho_A = \rho_{ae} = \rho_a \otimes |e_0\rangle\langle e_0|$, and $E_a$ in (5) is 
simply $I_a \otimes |e_0\rangle\langle e_0|$. Starting with the projective decomposition 
$\Pi_A$ for the larger system A, one can think of the 
corresponding positive operators defined in (5) as convenient 
mathematical tools for computing probabilities in 
those cases in which the density operator $\rho_A$ has support 
in the subspace $\mathcal{H}_a$ onto which $E_a$ projects. From this 
perspective, and using the corresponding Naimark extensions 
for b and c, one could reformulate the results given 
in later sections of this paper in terms of projective decompositions 
on the larger Hilbert spaces. However, the 
use of POVMs provides in many cases a simpler mathematical 
form, and a theorem that refers to an arbitrary 
POVM obviously includes projective decompositions as 
particular cases. Nonetheless, when thinking in physical 
or operational terms about the type of information represented 
by a general POVM $P_a$ it is helpful to employ 
its Naimark counterpart. 
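
As an illustration of (5), the sketch below constructs one possible Naimark extension of the qubit "trine" POVM, $P_j = (2/3)|\phi_j\rangle\langle\phi_j|$ with the $|\phi_j\rangle$ at angles 0, 120 and 240 degrees. The trine example, the particular three-dimensional extension, and all variable names are assumptions chosen for illustration; the extension is not unique.

```python
import math

# Trine POVM on a qubit: P_j = (2/3)|phi_j><phi_j|, phi_j = (cos t_j, sin t_j).
angles = [2.0 * math.pi * j / 3.0 for j in range(3)]

# Assumed extension: orthonormal kets |w_j> in a 3-dimensional space H_A.
# The first two components lie in H_a, the third is the extra dimension.
c = math.sqrt(2.0 / 3.0)
w = [[c * math.cos(t), c * math.sin(t), c / math.sqrt(2.0)] for t in angles]

def dot(u, v):
    """Real inner product (all vectors here are real)."""
    return sum(a * b for a, b in zip(u, v))

def outer(u, v):
    """Matrix |u><v| for real vectors."""
    return [[a * b for b in v] for a in u]

def compress(m):
    """E_a M E_a restricted to H_a: keep the upper-left 2x2 block."""
    return [row[:2] for row in m[:2]]

# Gram matrix of the |w_j>: the identity if they form an orthonormal basis.
gram = [[dot(wj, wk) for wk in w] for wj in w]

# POVM elements recovered from the projectors |w_j><w_j| via Eq. (5).
povm = [compress(outer(wj, wj)) for wj in w]
povm_sum = [[sum(P[row][col] for P in povm) for col in range(2)]
            for row in range(2)]

print(gram)
print(povm_sum)
```

Here each rank-1 POVM element corresponds to a rank-1 projector on the larger space, so no additional projector $\Pi_{A0}$ is needed.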

The information about a POVM $P_a = \{P_{aj}\}$ is said to 
be completely or perfectly present in system b provided 
the conditional density operators 

$$p_j \rho_{bj} = \mathrm{Tr}_a(P_{aj}\rho_{ab}), \qquad p_j = \Pr(P_{aj}) = \mathrm{Tr}(P_{aj}\rho_a), \tag{6}$$

on $\mathcal{H}_b$ are mutually orthogonal: $\rho_{bj}\rho_{bj'} = 0$ for $j \neq j'$. 
Conversely, this type of information is (completely) absent 
from b when the conditional density operators $\rho_{bj}$ 
are identical. One can visualize this in terms of measurements 
as follows. Suppose a POVM $\{P_{aj}\}$ measurement 
is carried out on system a. Can the value of j be deduced 
by carrying out an appropriate sort of measurement on 
system b? If the $\rho_{bj}$ are orthogonal to each other this 
is clearly possible using (projective) measurements corresponding 
to a suitable decomposition $\{Q_{bk}\}$ of $I_b$. But 
in the other extreme in which the $\rho_{bj}$ are identical it 
is clear that no measurement on b will provide any information 
about j. It is worth noting that one obtains 
the same values in (6) by replacing the formulas with 
$p_j\rho_{bj} = \mathrm{Tr}_A(\Pi_{Aj}\rho_{Ab})$ and $p_j = \mathrm{Tr}(\Pi_{Aj}\rho_A)$, so all of our 
measures in Sec. III B quantifying the presence of the $P_a$ 
information in b, depending only on $\{p_j, \rho_{bj}\}$, will be unaffected 
by replacing $P_a$ with its Naimark extension $\Pi_A$. 
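
The two extremes can be illustrated with a small numerical sketch. It is restricted, as an assumption made for simplicity, to a pure bipartite state $|\psi\rangle = \sum_{ik} c_{ik}|a_i\rangle|b_k\rangle$ and a rank-1 projective decomposition (orthonormal basis) on a, in which case each conditional state on b is pure and is represented by an unnormalized ket. The helper names `conditional_kets`, `overlap`, and `classify` are hypothetical, and all outcomes are assumed to have non-zero probability.

```python
import math

def conditional_kets(c, basis):
    """Unnormalized kets |beta_j> = (<w_j| x I_b)|psi> on b; <beta_j|beta_j> = p_j."""
    rows, cols = range(len(c)), range(len(c[0]))
    return [[sum(complex(w[i]).conjugate() * c[i][k] for i in rows) for k in cols]
            for w in basis]

def overlap(u, v):
    """Inner product <u|v>."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def classify(c, basis, tol=1e-12):
    """Classify how the information about `basis` on a is present in b."""
    betas = conditional_kets(c, basis)
    probs = [overlap(b, b).real for b in betas]
    pairs = [(j, k) for j in range(len(betas)) for k in range(j)]
    # Mutually orthogonal conditional states: perfectly present.
    if all(abs(overlap(betas[j], betas[k])) < tol for j, k in pairs):
        return "perfectly present"
    # Identical (normalized) conditional states: |<b_j|b_k>|^2 = p_j p_k.
    if all(abs(abs(overlap(betas[j], betas[k])) ** 2 - probs[j] * probs[k]) < tol
           for j, k in pairs):
        return "absent"
    return "partially present"

s = 1.0 / math.sqrt(2.0)
z_basis = [[1, 0], [0, 1]]
x_basis = [[s, s], [s, -s]]
bell = [[s, 0], [0, s]]            # (|00> + |11>)/sqrt(2)
product = [[s, 0], [s, 0]]         # |+>|0>
partial = [[s, 0], [0.5, 0.5]]     # (|0>|0> + |1>|+>)/sqrt(2)

print(classify(bell, z_basis), classify(bell, x_basis))
print(classify(product, z_basis))
print(classify(partial, z_basis))
```

For the fully entangled state both the z and the x information about a are perfectly present in b, whereas for the product state the z information is absent.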



B. Quantum channels 

In some sense the most natural way to state the various 
results given below in Secs. IV and V is in terms 
of correlations in which all three parts a, b, and c are 
treated, at least formally, in a symmetrical fashion. But 
some of the more interesting applications are to quan- 
tum channels and complementary channels, in which the 
channel entrance is not treated in the same way, either 
formally or intuitively, as the channel output. Hence in 
order to facilitate application of our results to the case of 
channels, we provide a brief explanation, using ideas in 
[3 mi : of why the "tripartite" and the "channel" prob- 
lem are not only closely related to each other, but in some 



sense identical problems in the case where one restricts 
attention to a pure-state pre-probability $|\Omega\rangle \in \mathcal{H}_{abc}$. 







FIG. 1: How $|\Omega\rangle$ is produced by applying the isometry $V$ to 
an entangled state $|\Phi\rangle$. 



Consider the situation shown in Fig. 1, where 

$$|\Omega\rangle = (I_a \otimes V)|\Phi\rangle \tag{7}$$

is the result of applying an isometry 

$$V = \sum_j |s_j\rangle\langle a'_j| \tag{8}$$

to the a' part of an entangled state $|\Phi\rangle \in \mathcal{H}_a \otimes \mathcal{H}_{a'}$, with 
$\mathcal{H}_{a'}$ a copy (i.e., the same dimension) of $\mathcal{H}_a$. Here $\{|a'_j\rangle\}$ 
is some orthonormal basis of $\mathcal{H}_{a'}$ held fixed during the 
following discussion; we are assuming that $d_a \le d_b d_c$, 
and the requirement that $V$ be an isometry, which is to 
say $V^\dagger V = I_a$, is equivalent to the assumption that the 
kets $\{|s_j\rangle\}$ form an orthonormal collection spanning the 
subspace $\mathcal{H}_s = V\mathcal{H}_{a'}$ of $\mathcal{H}_{bc}$. 

If, in particular, $|\Phi\rangle$ is the fully-entangled state 

$$|\Phi\rangle = (1/\sqrt{d_a}) \sum_j |a_j\rangle \otimes |a'_j\rangle, \tag{9}$$

then 

$$|\Omega\rangle = (1/\sqrt{d_a}) \sum_j |a_j\rangle \otimes |s_j\rangle \tag{10}$$

is an example of what in [14] is called a channel ket, 
characterized by the property that 

$$\rho_a = \mathrm{Tr}_{bc}(|\Omega\rangle\langle\Omega|) = I_a/d_a. \tag{11}$$

Indeed, given a pre-probability $|\Omega\rangle$ such that (11) holds, 
it is necessarily a fully-entangled state on $\mathcal{H}_a \otimes \mathcal{H}_{bc}$, so it 
will have a Schmidt form (10) for $\{|a_j\rangle\}$ a given orthonormal 
basis of $\mathcal{H}_a$, and using the orthonormal collection of 
states $\{|s_j\rangle\}$ corresponding to this Schmidt decomposition 
one can define a corresponding isometry $V$ by means 
of (8). Thus by employing map-state duality (see e.g. 
[i| or Ch. 11 of [3) one can move from a channel ket 
$|\Omega\rangle$ satisfying (11) to an isometry $V$ or the reverse. 

From an information-theoretic perspective the isometry 
$V$ corresponds to saying that all information about 
the system a is in the system bc, and in fact in the subspace 
$\mathcal{H}_s$ of $\mathcal{H}_{bc}$ spanned by the $|s_j\rangle$ in (10). The partial 
traces onto $\mathcal{H}_b$ and $\mathcal{H}_c$ in a sense "project down" parts of 
this information onto these subsystems. Thus, not surprisingly, 
the projector $T$ onto $\mathcal{H}_s$, along with its partial 
traces down to $\mathcal{H}_b$ and $\mathcal{H}_c$, 

$$T = \sum_j |s_j\rangle\langle s_j|, \qquad T_b = \mathrm{Tr}_c(T), \qquad T_c = \mathrm{Tr}_b(T), \tag{12}$$

play useful roles in our thinking about these problems, as 
they in a sense describe, in a basis-independent way, how 
the subspace $\mathcal{H}_s$ is "oriented" relative to the factor spaces 
$\mathcal{H}_b$ and $\mathcal{H}_c$. Note that while $T$ is a projector, $T_b$ and $T_c$ 
are positive operators but (in general) not projectors. 

The isometry in (8) can be used to define a quantum 
channel from a to b through the superoperator 

$$\mathcal{E}(A) = \mathrm{Tr}_c(VAV^\dagger) = \sum_l K_l A K_l^\dagger \tag{13}$$

that maps the space $\mathcal{L}(\mathcal{H}_a)$ of operators on $\mathcal{H}_a$ to the 
corresponding space $\mathcal{L}(\mathcal{H}_b)$ of operators on $\mathcal{H}_b$. Here the 
Kraus operators are maps from $\mathcal{H}_a$ to $\mathcal{H}_b$ of the form 

$$K_l = \langle c_l|V = \sum_j \langle c_l|s_j\rangle\langle a_j|, \tag{14}$$

where $\{|c_l\rangle\}$ is an orthonormal basis of $\mathcal{H}_c$, and $\langle c_l|s_j\rangle$ is 
a ket in $\mathcal{H}_b$, defined in an obvious way, not just a complex 
number. Because $V$ is an isometry the Kraus operators 
satisfy the usual closure condition 

$$\sum_l K_l^\dagger K_l = I_a. \tag{15}$$

The complementary channel from a to c, 

$$\mathcal{F}(A) = \mathrm{Tr}_b(VAV^\dagger) = \sum_m L_m A L_m^\dagger, \tag{16}$$

is defined in a similar way with Kraus operators 

$$L_m = \langle b_m|V = \sum_j \langle b_m|s_j\rangle\langle a_j|, \tag{17}$$

for $\{|b_m\rangle\}$ some orthonormal basis of $\mathcal{H}_b$, and these again 
satisfy the closure condition analogous to (15). 
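
The construction of the Kraus operators in (14) and (17) from the kets $|s_j\rangle$ can be sketched numerically. The example below assumes, purely for illustration, the isometry of a qubit amplitude-damping channel with damping parameter `gamma`; the array layout `s[j][b][c]` (the component of $|s_j\rangle$ along $|b\rangle|c\rangle$) and the function names are likewise assumptions.

```python
import math

gamma = 0.3  # assumed damping parameter, 0 <= gamma <= 1

# |s_0> = |0_b 0_c>,  |s_1> = sqrt(1-gamma)|1_b 0_c> + sqrt(gamma)|0_b 1_c>
s = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, math.sqrt(gamma)], [math.sqrt(1.0 - gamma), 0.0]],
]

def channel_kraus(s):
    """K_l = <c_l|V as d_b x d_a matrices: K_l[b][j] = s[j][b][l], Eq. (14)."""
    da, db, dc = len(s), len(s[0]), len(s[0][0])
    return [[[s[j][b][l] for j in range(da)] for b in range(db)]
            for l in range(dc)]

def complementary_kraus(s):
    """L_m = <b_m|V as d_c x d_a matrices: L_m[c][j] = s[j][m][c], Eq. (17)."""
    da, db, dc = len(s), len(s[0]), len(s[0][0])
    return [[[s[j][m][c] for j in range(da)] for c in range(dc)]
            for m in range(db)]

def closure(kraus):
    """sum_l K_l^dagger K_l as a d_a x d_a matrix, Eq. (15)."""
    da = len(kraus[0][0])
    return [[sum(complex(K[r][i]).conjugate() * K[r][j]
                 for K in kraus for r in range(len(K)))
             for j in range(da)] for i in range(da)]

K = channel_kraus(s)
L = complementary_kraus(s)
print(closure(K))
print(closure(L))
```

Both closure sums reduce to the identity because the $|s_j\rangle$ are orthonormal, which is the isometry condition on $V$.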

The superoperators $\mathcal{E}$ and $\mathcal{F}$, and their adjoints $\mathcal{E}^\dagger$ 
and $\mathcal{F}^\dagger$ relative to the usual Frobenius inner product 
$\langle P, Q\rangle = \mathrm{Tr}(P^\dagger Q)$, can be expressed directly in terms 
of $\rho_{abc} = |\Omega\rangle\langle\Omega|$ or its partial traces such as $\rho_{ab}$, using 
formulas such as 

$$\mathcal{E}(A) = d_a \mathrm{Tr}_a[(A^T \otimes I_b)\rho_{ab}], \qquad [\mathcal{E}^\dagger(B)]^T = d_a \mathrm{Tr}_b[(I_a \otimes B)\rho_{ab}], \tag{18}$$

where the superscript $T$ denotes the transpose relative to the basis $\{|a_j\rangle\}$ 
employed in (9) and (10). The complete positivity of $\mathcal{E}$ 
is equivalent to the requirement that $\rho_{ab}$ be a positive 
operator; in some respects this is simpler and more compact 
than the traditional definition. For it to be trace 
preserving it is necessary that $\rho_a$ be $I_a/d_a$, (11). Since 
in general neither $\rho_b$ nor $\rho_c$ is proportional to the corresponding 
identity, the adjoints $\mathcal{E}^\dagger$ and $\mathcal{F}^\dagger$ are not (in 
general) trace preserving, and in this sense are not quantum 
channels. This is one respect in which "tripartite" 
language is more flexible than "channel" language. 

It is also worth observing that the superoperator $\mathcal{E}$ 
uniquely determines $|\Omega\rangle$ up to local unitaries on $\mathcal{H}_a$ and 
$\mathcal{H}_c$ for a fixed $d_c$. This is because a set of Kraus operators 
$\{K_l\}$ is generated, using an orthonormal basis 
$\{|c_l\rangle\}$ of $\mathcal{H}_c$, and one can invert the process by writing 
$V = \sum_l |c_l\rangle \otimes K_l$, where of course the result depends on the 
choice of basis $\{|c_l\rangle\}$. Different orthonormal bases on $\mathcal{H}_c$, 
as is well-known, simply give rise to different collections 
of Kraus operators which represent the same quantum 
channel or operation. In this sense a channel completely 
determines its complementary channel for a fixed $d_c$, and 
vice versa, up to local unitaries. However, different insights 
may emerge by considering one rather than the 
other, or by thinking about the two together. 

We say there exists a perfect quantum channel from a 
to b when all types of information about a are perfectly 
present in b. This by itself implies that $\rho_a = I_a/d_a$ (see 
Theorem 3 in [13]), and thus $\mathcal{E}$ in (13) is trace-preserving. 
It obviously suffices to check that the information asso- 
ciated with every orthonormal basis is present in b, but 
there are also weaker conditions that ensure the presence 
of a perfect quantum channel; e.g. see 0, [l^ and the 
discussion in Sec. V D. 

A more general relationship is possible between an 
isometry $V$ and a tripartite pure state, by starting with 
the circuit in Fig. 1 but assuming that $|\Phi\rangle$, while no 
longer fully entangled, has full Schmidt rank: 

$$|\Phi\rangle = \sum_k \sqrt{\pi_k}\, |a_k\rangle \otimes |a'_k\rangle, \tag{19}$$

with $\pi_k > 0$ for every k. With $V$ an isometry of the form 
(8), j replaced by k, and 

$$\rho_a = \sum_k \pi_k |a_k\rangle\langle a_k| \tag{20}$$

the partial trace of $|\Phi\rangle\langle\Phi|$ down to $\mathcal{H}_a$, one has 

$$\rho_b = \mathrm{Tr}_{ac}(|\Omega\rangle\langle\Omega|) = \mathcal{E}(\rho_a), \tag{21}$$

where $\mathcal{E}$ is the superoperator corresponding to $V$ through 
(13). A similar result holds for the complementary a to c 
channel. The ket $|\Omega\rangle$ determines the projector $T = VV^\dagger$ 
uniquely, but $V$ itself only up to a unitary transformation 
on $\mathcal{H}_a$. Conversely, two isometries $V$ and $V'$ giving rise 
to the same $T$ can be used to generate the same $|\Omega\rangle$ by 
using two different entangled states $|\Phi\rangle$ and $|\Phi'\rangle$. 

The partially entangled $|\Phi\rangle$ is useful when addressing 
the following question: Suppose an ensemble 
$\{p_j, \rho_j\}$ of states is sent through the quantum channel 
$\mathcal{E}$. How can one relate the outputs $\mathcal{E}(\rho_j)$ of the channel 
to corresponding outcomes of a POVM measurement $P_a$ 
on the tripartite state $|\Omega\rangle$? Suppose the density operator 
$\rho_a = \sum_j p_j \rho_j$ of the ensemble is of the form (20), i.e., 
choose $|\Phi\rangle$ in (19) such that this is the case. Then define 
$P_a$ through 

$$P_{aj}^T = p_j W \rho_j W^\dagger, \tag{22}$$

where the superscript $T$ denotes the transpose in the basis $\{|a_k\rangle\}$, and 

$$W = \sum_k (1/\sqrt{\pi_k})\, |a_k\rangle\langle a_k|. \tag{23}$$

It is straightforward to show that $P_{aj}$ is a positive operator 
with the same rank as $\rho_j$ (since $W$ is nonsingular), 
and $\sum_j P_{aj} = I_a$. The probability of outcome j for the 
POVM is $p_j$, and the corresponding conditional density 
operator is 

$$\rho_{bcj} = \mathrm{Tr}_a(P_{aj}\rho_{abc})/p_j = V \rho_j V^\dagger, \tag{24}$$

with $\rho_{abc} = |\Omega\rangle\langle\Omega|$. Tracing this down to b yields $\mathcal{E}(\rho_j)$, 
the outcome when $\rho_j$ is sent through the channel. 

The preceding discussion requires some fairly obvious 
changes if some of the $\pi_k$ in (19) are zero. First, $T = VV^\dagger$ 
is not uniquely determined by $|\Omega\rangle$ since the $|s_k\rangle$ in (8) 
corresponding to zero $\pi_k$ are unknown. Second, the sum 
in (23) must be restricted to the k with $\pi_k > 0$, whereas 
(22) remains the same. 

III. INFORMATION MEASURES 

A. Entropies 

All the information measures that we will introduce are 
based on some sort of entropy. In classical information 
theory [1] the usual starting point is the Shannon entropy 

$$H(P) = H(\{p_j\}) = -\sum_j p_j \log p_j, \tag{25}$$

where $P$ denotes a random variable or its corresponding 
probability distribution. Given two random variables 
$P$ and $Q$ the entropy $H(P, Q)$ is obtained by replacing 
$p_j$ in (25) by the joint probability distribution 
$p_{jk} = \Pr(P_j, Q_k)$ and summing over both j and k. The 
conditional entropy and mutual information are then defined 
by: 

$$H(P|Q) = H(P,Q) - H(Q), \qquad H(P:Q) = H(P) + H(Q) - H(P,Q). \tag{26}$$
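
The classical quantities in (25) and (26) can be evaluated directly from a joint distribution. The following is a minimal sketch; the function names and the choice of base 2 for the logarithm are assumptions made for illustration.

```python
import math

def shannon(probs, base=2.0):
    """Shannon entropy, Eq. (25); zero-probability terms contribute nothing."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

def joint_entropy(joint, base=2.0):
    """H(P,Q) for a joint distribution given as a nested list p_jk."""
    return shannon([p for row in joint for p in row], base)

def marginals(joint):
    """Marginal distributions of P (rows) and Q (columns)."""
    return [sum(row) for row in joint], [sum(col) for col in zip(*joint)]

def conditional_entropy(joint, base=2.0):
    """H(P|Q) = H(P,Q) - H(Q), Eq. (26)."""
    _, q = marginals(joint)
    return joint_entropy(joint, base) - shannon(q, base)

def mutual_information(joint, base=2.0):
    """H(P:Q) = H(P) + H(Q) - H(P,Q), Eq. (26)."""
    p, q = marginals(joint)
    return shannon(p, base) + shannon(q, base) - joint_entropy(joint, base)

correlated = [[0.5, 0.0], [0.0, 0.5]]        # perfectly correlated bits
independent = [[0.25, 0.25], [0.25, 0.25]]   # independent uniform bits

print(mutual_information(correlated), conditional_entropy(correlated))
print(mutual_information(independent), conditional_entropy(independent))
```

Perfect correlation gives one bit of mutual information and zero conditional entropy, while independence gives zero mutual information and one bit of conditional entropy.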

The quantum entropy most closely analogous to Shannon's 
$H$ is the von Neumann entropy 

$$S(\rho) = -\mathrm{Tr}(\rho \log \rho), \tag{27}$$

but we have also studied some other possibilities: 

$$S_R(\rho) = \frac{1}{1-q} \log \mathrm{Tr}(\rho^q), \quad 0 < q \le 1,$$
$$S_T(\rho) = \frac{1}{1-q} [\mathrm{Tr}(\rho^q) - 1], \quad 0 < q < \infty,$$
$$S_Q(\rho) = 1 - \mathrm{Tr}(\rho^2). \tag{28}$$

Here $S_R$, $S_T$, and $S_Q$ are the Renyi, Tsallis, and 
quadratic (often misleadingly called linear) entropies, respectively. 
Some of our results are valid for all these 
entropies, in which case they are stated for $S_K$, where $K$ 
denotes either no subscript (von Neumann) or else one of 
the three symbols R, T, Q. 
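
Since each of the entropies in (27) and (28) depends only on the eigenvalues of $\rho$, they can be sketched as functions of an eigenvalue list. The function names are assumptions, and the natural logarithm is used so that the $q \to 1$ limits can be compared with the von Neumann entropy.

```python
import math

def von_neumann(eigs):
    """S = -sum x log x over the eigenvalues (natural log), Eq. (27)."""
    return -sum(x * math.log(x) for x in eigs if x > 0)

def renyi(eigs, q):
    """Renyi entropy S_R for q != 1, Eq. (28)."""
    return math.log(sum(x ** q for x in eigs if x > 0)) / (1.0 - q)

def tsallis(eigs, q):
    """Tsallis entropy S_T for q != 1, Eq. (28)."""
    return (sum(x ** q for x in eigs if x > 0) - 1.0) / (1.0 - q)

def quadratic(eigs):
    """Quadratic entropy S_Q = 1 - Tr(rho^2), Eq. (28)."""
    return 1.0 - sum(x * x for x in eigs)

eigs = [0.7, 0.3]  # eigenvalues of an example qubit density operator

print(von_neumann(eigs))
print(renyi(eigs, 1.0 - 1e-6), tsallis(eigs, 1.0 + 1e-6))  # both close to S
print(tsallis(eigs, 2.0), quadratic(eigs))                 # equal at q = 2
```

The last line illustrates that the Tsallis entropy coincides with the quadratic entropy at q = 2.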

All of these entropies are strictly concave, 
$S_K(\sum_j p_j \rho_j) \ge \sum_j p_j S_K(\rho_j)$ for $0 < p_j < 1$ and 
$\sum_j p_j = 1$, with equality if and only if all $\rho_j$'s are equal, 
provided the parameter q in the case of $S_R$ and $S_T$ is 
in the range specified in (28). Both $S_R$ and $S_T$ are equal 
to $S$ in the limit q = 1, and $S_T$ interpolates between $S$ 
and $S_Q$ as q goes from 1 to 2.^ The entropies $S$, $S_Q$, 
and $S_T$ for $q \ge 1$, are subadditive [Ol in the sense that 
$S_K(\rho_a) + S_K(\rho_b) \ge S_K(\rho_{ab})$, but only the von Neumann 
$S$ has the property of strong subadditivity on a tripartite 
system (p. 519 of 0): 

$$S(\rho_{ab}) + S(\rho_{bc}) \ge S(\rho_{abc}) + S(\rho_b). \tag{29}$$

Given a bipartite quantum system with a density operator 
$\rho_{ab}$, partial traces $\rho_a$ and $\rho_b$, the quantum conditional 
entropy and the quantum mutual information are 
defined as (p. 514 of 0) 

$$S(a|b) = S(\rho_{ab}) - S(\rho_b), \qquad S(a:b) = S(\rho_a) + S(\rho_b) - S(\rho_{ab}), \tag{30}$$

which are formally analogous to the quantities in (26). 
Note that $S(a|b)$ can be negative. On the other hand, 
$S(a:b)$ is nonnegative and vanishes for a product state 
$\rho_{ab} = \rho_a \otimes \rho_b$, and thus can be regarded in some sense 
as a measure of how much information about a is in b or 
vice versa. Thought of in this way it has the property 
that for a tripartite system abc, 

$$S(a:bc) \ge S(a:b), \tag{31}$$

i.e., there is less information about a in b, a subsystem 
of bc, than there is in bc, which seems a reasonable requirement 
for a measure of information. Note that (31) 
is equivalent to (29), a property not shared (in general) 
by the other entropies defined in (28). 
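
The remark that $S(a|b)$ can be negative is easily illustrated for a pure bipartite state, where $S(\rho_{ab}) = 0$ and $S(\rho_a) = S(\rho_b)$ is the Shannon entropy of the squared Schmidt coefficients. The sketch below relies on that restriction to pure states; the function names and the base-2 logarithm are assumptions made for illustration.

```python
import math

def entropy(eigs, base=2.0):
    """Entropy of a density operator from its eigenvalues."""
    return -sum(x * math.log(x, base) for x in eigs if x > 0)

def pure_state_measures(schmidt_probs):
    """S(a|b) and S(a:b) of Eq. (30) for a pure state of ab.

    schmidt_probs are the squared Schmidt coefficients, i.e. the common
    non-zero eigenvalues of rho_a and rho_b.
    """
    s_ab = 0.0                       # a pure state has zero entropy
    s_a = s_b = entropy(schmidt_probs)
    return {"S(a|b)": s_ab - s_b, "S(a:b)": s_a + s_b - s_ab}

bell = pure_state_measures([0.5, 0.5])   # fully entangled two-qubit state
product = pure_state_measures([1.0])     # product state

print(bell)
print(product)
```

For the fully entangled two-qubit state the conditional entropy is -1 bit and the mutual information is 2 bits, while both vanish for a pure product state.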

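We shall later prove our main result using the relative 
entropy: 

$$S(\rho\|\sigma) = \mathrm{Tr}(\rho \log \rho) - \mathrm{Tr}(\rho \log \sigma), \tag{32}$$

which has the useful property [3l that it is non-increasing 
under the action of a quantum channel $\mathcal{E}$, 

$$S(\rho\|\sigma) \ge S(\mathcal{E}(\rho)\|\mathcal{E}(\sigma)). \tag{33}$$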



^ For this remark (that $S_T = S$ for q = 1) to be true, one should 
use base e for the log appearing in $S$; however, we note that all 
other remarks and results in this article are valid for an arbitrary 
base of the log. 
The extension of (32) to general positive operators is 
natural, and [19] for any positive operators $A$, $B$, and $C$, 
if $C \ge B$ (i.e., $C - B$ is a positive operator), 

$$S(A\|B) \ge S(A\|C). \tag{34}$$

B. Distinguishability measures 

While $S(a:b)$ can serve as an overall indication of how 
much information about a is in b, or vice versa, it is 
not a measure that depends on the type of information, 
so it cannot be used to compare how well different types 
of information about a are found in, or transmitted to, 
b. For this purpose one could use a fidelity measure: 
how closely a state on $\mathcal{H}_b$ resembles its counterpart on 
$\mathcal{H}_a$. However, this requires making some identification 
between the two Hilbert spaces, which is not easy to do 
if they are of different dimension, or else one needs an 
additional map or channel to carry $\mathcal{H}_b$ back to $\mathcal{H}_a$. For 
this and other reasons we prefer to use a distinguishability 
measure. Thus suppose $P_a = \{P_{aj}\}$ is a decomposition 
of the identity $I_a$ of $\mathcal{H}_a$, (1), and $\{p_j, \rho_{bj}\}$ is the ensemble 
of conditional states on $\mathcal{H}_b$ defined in (6). Two extreme 
cases were discussed in Sec. II A: that in which the $P_a$ 
type of information is perfectly present in b, which means 
$\rho_{bj}\rho_{bk} = 0$ for $j \neq k$, thus conditional density operators 
perfectly distinguishable; and the $P_a$ type of information 
(completely) absent from b, meaning the $\rho_{bj}$ are identical 
for all j and thus indistinguishable. Our goal is to 
assign numerical values to situations lying between these 
extremes. 

Ideally one might use some collection of numbers refer- 
ring to the distinguishability of every pair of conditional 
density operators $\rho_{bj}$; see [20] for an overview of distinguishability
measures for two density operators. However, we shall employ a much coarser
but still useful characterization in which a single number, in some sense an
"average" distinguishability, is assigned to each informa- 
tion type, thereby allowing us to focus on our primary 
goal: elucidating how the amount of information depends 
upon the type considered, for a given pre-probability 
(density operator or channel). As is customary in infor- 
mation theory we want a measure that is nonnegative, 
that is (formally) invariant under local unitary opera- 
tions, and, naturally, we prefer simple mathematical ex- 
pressions that have a clear intuitive interpretation. This 
still leaves many possibilities, but among them we have 
found that measures based on the Holevo function 

$\chi_K(\{p_j, \rho_j\}) = S_K\big(\sum_j p_j \rho_j\big) - \sum_j p_j S_K(\rho_j)$ (35)

are particularly useful, where $\{p_j, \rho_j\}$ denotes an ensemble associated with a
particular Hilbert space $\mathcal{H}$: each $\rho_j$ a density operator on this space, and
the $\{p_j\}$ a probability distribution. Here $S_K$ could be any of the entropies defined in
(27) or (28): S without a subscript refers to the von Neumann entropy, and the corresponding
$\chi$ has no subscript. Because each of these entropies is a strictly concave function (for
q in the appropriate range indicated in (28)), $\chi_K$ is nonnegative and equal to zero if
and only if the $\rho_j$ are identical.
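For a qubit the von Neumann version of (35) can be evaluated in closed form, because the entropy of a state with Bloch vector r depends only on |r|. The sketch below is an illustration under that assumption (qubit states given as Bloch vectors, natural logarithm); the helper names and the three sample ensembles are hypothetical choices made here.

```python
import math

def binary_entropy(x):
    """Return -x log x - (1-x) log(1-x) in nats, using 0 log 0 = 0."""
    return -sum(t * math.log(t) for t in (x, 1.0 - x) if t > 0.0)

def qubit_entropy(r):
    """von Neumann entropy of the qubit state (I + r.sigma)/2."""
    norm = min(1.0, math.sqrt(sum(c * c for c in r)))
    return binary_entropy((1.0 + norm) / 2.0)

def holevo_chi(probs, bloch_vectors):
    """chi = S(sum_j p_j rho_j) - sum_j p_j S(rho_j) for a qubit ensemble."""
    average = [sum(p * r[i] for p, r in zip(probs, bloch_vectors)) for i in range(3)]
    mean_entropy = sum(p * qubit_entropy(r) for p, r in zip(probs, bloch_vectors))
    return qubit_entropy(average) - mean_entropy

# Identical states: nothing to distinguish, so chi vanishes.
chi_identical = holevo_chi([0.3, 0.7], [(0.0, 0.0, 0.5), (0.0, 0.0, 0.5)])
# Orthogonal pure states with equal weights: chi reaches log 2.
chi_orthogonal = holevo_chi([0.5, 0.5], [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)])
# Non-orthogonal pure states: chi lies strictly between the two extremes.
chi_overlapping = holevo_chi([0.5, 0.5], [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)])
```

The three values reproduce the two extreme cases described above (information absent, information perfectly present) and one intermediate case.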

When (35) is applied to the ensemble $\{p_j, \rho_{bj}\}$ of (6), states in $\mathcal{H}_b$
conditional on the decomposition $P_a = \{P_{aj}\}$ in (??), the result is

$\chi_K(P_a, b) := S_K(\rho_b) - \sum_j p_j S_K(\rho_{bj})$, (36)

a measure of the amount of information of type $P_a$ in b. This is also a numerical measure
of what is sometimes called quantum side information [10].

While $P_a$ can refer to a general projective decomposition of $I_a$ or a POVM, we will often
be interested in an orthonormal basis $w = \{|w_j\rangle\}$ of $\mathcal{H}_a$, with
projectors $|w_j\rangle\langle w_j|$, in which case we will write $\chi_K(w, b)$, omitting
the a subscript when it is obvious from the context. One can easily show using the concavity
of $S_K$ that

$\chi_K(P_a, b) \geq \chi_K(\tilde{P}_a, b)$, (37)

where $P_a$ and $\tilde{P}_a$ are POVMs, and $\tilde{P}_a$ is a coarse-graining of $P_a$
formed by summing some of the $P_{aj}$ elements. Also, as a consequence of (29), see [21],

$\chi(P_a, bc) \geq \chi(P_a, b)$, (38)

so a subsystem b of bc cannot contain more information than bc itself. (This does not hold
for $\chi_K$ with K = R, T or Q.)

In the case of a quantum channel $\mathcal{E}$ (13) from a to b associated with isometry V
from a to bc, we define

$\chi_K(P_a, \mathcal{E}) := S_K\big[\mathcal{E}\big(\sum_j p_j \rho_{aj}\big)\big] - \sum_j p_j S_K[\mathcal{E}(\rho_{aj})]$, (39)

where $P_a$ is a POVM, $I_a = \sum_j P_{aj} = d_a \sum_j p_j \rho_{aj}$, with

$\rho_{aj} = P_{aj}/\mathrm{Tr}(P_{aj})$, $p_j = \mathrm{Tr}(P_{aj})/d_a$. (40)

Note that $\mathcal{E}(\sum_j p_j \rho_{aj}) = \mathrm{Tr}_c(VV^\dagger)/d_a = T_b/d_a$
[see (??)] in the first term of (39) is independent of the POVM $P_a$. Equation (39) is a
measure of how well $\mathcal{E}$ preserves the distinguishability of the $P_a$ ensemble;
e.g., if $\mathcal{E}$ perfectly preserves the orthogonality of an input orthonormal basis w
then $\chi(w, \mathcal{E}) = \log d_a$; otherwise $\chi(w, \mathcal{E}) < \log d_a$ (see
Lemma 1 below).
In contrast to $\chi(P_a, b)$, the quantity [?]

$H(P_a|b) := H(P_a) - \chi(P_a, b)$ (41)

is a measure of absence of the $P_a$ type of information from b, where $H(P_a)$ is the
Shannon entropy (25) associated with the probabilities defined in (5).* One can also think
of $H(P_a|b)$ as the missing information about $P_a$ given the quantum system b, and it is a
natural quantum analog of $H(P_a|Q_b) = H(P_a) - H(P_a:Q_b)$ [see (??)], where one
identifies $\chi(P_a, b)$ as a quantum analog of $H(P_a:Q_b)$.^ In contrast to $S(a|b)$
defined in (30), $H(P_a|b)$ is nonnegative (see Lemma 1); it equals the Shannon missing
information $H(P_a)$ in the case when b provides no information about $P_a$, and it equals
zero only when b perfectly contains the $P_a$ information.

* Following [?], we use H for classical entropy and S for quantum entropy. For conditional
entropy, we use H if the first argument is classical as in (41), and S if the first argument
is more general (quantum) as in (30).

We remark that an alternative way of defining $H(P_a|b)$, similar to that employed in
[7, 8], is to introduce the quantum channel $\mathcal{E}_P$ from ab to eb defined by

$\mathcal{E}_P(\rho_{ab}) = \sum_j |e_j\rangle\langle e_j| \otimes \mathrm{Tr}_a(P_{aj}\rho_{ab})$, (42)

where $\{|e_j\rangle\}$ is an orthonormal basis for an auxiliary system e. Then $H(P_a|b)$ is
the von Neumann conditional entropy $S(e|b)$ of the state $\mathcal{E}_P(\rho_{ab})$.
Lemma 1. This lemma summarizes some useful properties of the $\chi(P_a, b)$ and $H(P_a|b)$
measures.

(i) For any ensemble $\{p_j, \rho_j\}$,

$\chi(\{p_j, \rho_j\}) = S\big(\sum_j p_j \rho_j\big) - \sum_j p_j S(\rho_j) \leq H(\{p_j\})$, (43)

with equality if and only if the $\rho_j$ are mutually orthogonal.

(ii) Let $P_a$ and $Q_b$ be any two POVMs on a and b respectively, and $\rho_{ab}$ any state
on $\mathcal{H}_{ab}$. Then

$H(P_a:Q_b) \leq \chi(P_a, b) \leq \min\{S(\rho_a), S(\rho_b), S(a:b)\}$, (44)

and hence by (41), (43), and (44),

$0 \leq H(P_a|b) \leq H(P_a|Q_b)$. (45)

□

Part (i) is from [?] (Theorem 11.10, p. 518). The left-hand side of (44) is Holevo's bound
(p. 531 of [?]), and the right-hand side of (44) is similar to Proposition 1 of [?], though
we prove it in Appendix A since we have explicitly inserted the bound on $\chi$.



C. Entropy biases and coherent information 

In addition to quantitative measures of information about one system present in another it
is useful to have measures of information differences. In what follows we shall make use of
two quantities of this type. When considering two systems b and c,

$\Delta S_K(b, c) := S_K(\rho_b) - S_K(\rho_c)$, (46)

is the entropy bias, while for information type $P_a$,

$\Delta\chi_K(P_a; b, c) := \chi_K(P_a, b) - \chi_K(P_a, c)$ (47)

is the information bias. Analogous quantities for the complementary channels $\mathcal{E}$
and $\mathcal{F}$ (to b and c respectively) arising from isometry V are:

$\Delta S_K(\mathcal{E}, \mathcal{F}) := S_K(T_b/d_a) - S_K(T_c/d_a)$,
$\Delta\chi_K(P_a; \mathcal{E}, \mathcal{F}) := \chi_K(P_a, \mathcal{E}) - \chi_K(P_a, \mathcal{F})$. (48)

Unlike our information measures these quantities can (obviously) be negative. When using the
von Neumann entropy we omit the subscript K and denote these quantities, e.g., by
$\Delta S(b, c)$ and $\Delta\chi(P_a; b, c)$.

^ It is straightforward to show that $\chi(P_a, b)$ becomes $H(P_a:Q_b)$ if one replaces the
conditional density operators $\rho_{bj}$ in (36) with conditional probability distributions
$\Pr(Q_b | P_a = P_{aj})$, and also replaces S() with H().

The coherent information $I_{coh}$ (Sec. 12.4.2 of [?]) is a particular instance of the
entropy bias for the tripartite pure state $|\Omega\rangle$:

$I_{coh}(\rho_{a'}, \mathcal{E}) = \Delta S(b, c)$ (49)

where, see the discussion in Sec. II B associated with (19), the quantum channel
$\mathcal{E}$ corresponds to an isometry V which yields $|\Omega\rangle$ when applied to an
entangled state $|\Phi\rangle$ chosen so that the partial trace of
$|\Phi\rangle\langle\Phi|$ down to a' yields the density operator $\rho_{a'}$. The density
operators $\rho_b$ and $\rho_c$ needed to define the entropy bias, (46), on the right side of
(49) are the partial traces of $|\Omega\rangle\langle\Omega|$ down to systems b and c,
respectively. It can also be seen more directly, for the maximally-mixed input state, that
$I_{coh}(I_{a'}/d_{a'}, \mathcal{E}) = \Delta S(\mathcal{E}, \mathcal{F})$.

Despite the connection in (49), the entropy bias in (46) seems more natural in the state or
static point of view, which lacks the notion of inputs and outputs, than $I_{coh}$. The
latter has always been thought of as a function of a trace-preserving superoperator
$\mathcal{E}$ and an input state $\rho_{a'}$ to a channel, whereas the biases in (46) and
(47) are simply functions of the tripartite state, without making reference to how it may
have been generated by the combination of an isometry and a partially-entangled state.

IV. BASIS INVARIANCE 

We begin our discussion of how the amount of information about system a in some other
system(s) depends on the type of information with two cases in which certain quantities are
actually independent of type. In both of them, a pure-state pre-probability is assumed.
Theorem 2. Consider a bipartite system with a pure-state pre-probability
$\rho_{ab} = |\Psi\rangle\langle\Psi|$. Let N be a rank-1 POVM on a, let w be an orthonormal
basis (thus also a rank-1 POVM) on a, then

$\chi_K(w, b) = \chi_K(N, b) = S_K(\rho_a)$ (50)

is independent of the basis w or rank-1 POVM N.
Proof. Apply (36) to w, setting $S_K(\rho_b) = S_K(\rho_a)$ and the second term in (36) to
zero because each $\rho_{bj}$ is a pure state, proving $\chi_K(w, b) = S_K(\rho_a)$. From
Sec. II A, N is equivalent to an orthonormal basis $v_{ae}$ on
$\mathcal{H}_a \otimes \mathcal{H}_e$ assuming the state on ae is
$\rho_{ae} = \rho_a \otimes |e_0\rangle\langle e_0|$, where $|e_0\rangle$ is some pure state
on e. Thus, $\chi_K(N, b) = \chi_K(v_{ae}, b) = S_K(\rho_{ae})$, but
$S_K(\rho_{ae}) = S_K(\rho_a)$ for all entropy functions under consideration. □
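The basis independence in (50) can be checked numerically for a two-qubit pure state and the von Neumann entropy. The sketch below is only an illustration: the state, the three bases, and all function names are choices made here, and the 2x2 eigenvalues are obtained from the trace and determinant rather than from a linear-algebra library.

```python
import cmath
import math

def entropy_2x2(m):
    """von Neumann entropy (in nats) of a 2x2 density matrix."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    eigenvalues = ((tr + disc) / 2.0, (tr - disc) / 2.0)
    return -sum(x * math.log(x) for x in eigenvalues if x > 1e-15)

def chi_of_basis(psi, basis):
    """chi(w, b) of (36) for |psi> = sum_ij psi[i][j] |i>_a |j>_b and a basis w on a."""
    rho_b = [[0j, 0j], [0j, 0j]]
    mean_entropy = 0.0
    for w in basis:
        # Unnormalized conditional vector on b, obtained by projecting a onto |w>.
        phi = [sum(w[i].conjugate() * psi[i][j] for i in range(2)) for j in range(2)]
        p = sum(abs(x) ** 2 for x in phi)
        if p < 1e-15:
            continue
        cond = [[phi[j] * phi[k].conjugate() / p for k in range(2)] for j in range(2)]
        mean_entropy += p * entropy_2x2(cond)
        for j in range(2):
            for k in range(2):
                rho_b[j][k] += p * cond[j][k]
    return entropy_2x2(rho_b) - mean_entropy

def reduced_state_a(psi):
    """Partial trace of |psi><psi| over b."""
    return [[sum(psi[i][j] * psi[k][j].conjugate() for j in range(2)) for k in range(2)]
            for i in range(2)]

psi = [[0.8, 0.1j], [0.2, math.sqrt(0.31)]]  # normalized, partially entangled
s = 1.0 / math.sqrt(2.0)
theta, phase = 0.7, cmath.exp(1.1j)
bases = {
    "z": [(1.0, 0.0), (0.0, 1.0)],
    "x": [(s, s), (s, -s)],
    "tilted": [(math.cos(theta), phase * math.sin(theta)),
               (math.sin(theta), -phase * math.cos(theta))],
}
entropy_a = entropy_2x2(reduced_state_a(psi))
chi_values = {name: chi_of_basis(psi, basis) for name, basis in bases.items()}
```

Each entry of `chi_values` should agree with `entropy_a` up to rounding, which is the content of (50) for orthonormal bases.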

This implies that if the w information about a is absent from b, $\chi_K(w, b) = 0$, all
types are absent and $|\Psi\rangle$ is a product state, which is one form of the Absence
theorem of [?]. And it generalizes in that if the w information is almost absent from b,
then by (37), $\chi(w, b) \geq \chi(P, b)$, any other type P is almost absent from b. On the
other hand, one can read (50) as a statement that all (rank-1) types of information are
equally present; the only problem is interpreting the common value of
$\chi_K(w, b) = S_K(\rho_a)$. In the case of the von Neumann entropy,
$\chi(w, b) = S(\rho_a)$ is the usual entanglement measure of $|\Psi\rangle$, and is an
upper bound on the Shannon mutual information (Lemma 1) that can be achieved by performing
measurements in the Schmidt bases on a and b. Note that reading (50) in reverse provides a
natural interpretation for $S_K(\rho_a)$: it is the amount of information about any rank-1
type N contained in a system b that purifies $\rho_a$, as measured by $\chi_K(N, b)$.

The following useful result for tripartite pure states and complementary channels (see
Sec. VI A) is proved in Appendix B.

Theorem 3. Let M and N be rank-1 POVMs on a, and let v and w be orthonormal bases (thus also
rank-1 POVMs) on a.

(i) Consider a tripartite system with a pure-state pre-probability
$\rho_{abc} = |\Omega\rangle\langle\Omega|$. Then the information bias defined in (47),

$\Delta\chi_K(w; b, c) = \Delta\chi_K(N; b, c) = \Delta S_K(b, c)$, (51)

where K denotes any of the entropies defined in (27) or (28), is equal to the corresponding
entropy bias, and thus independent of the choice of orthonormal basis or rank-1 POVM. It
follows that the difference:

$\chi_K(M, b) - \chi_K(N, b) = \chi_K(M, c) - \chi_K(N, c)$, (52)

is the same for b and c, which obviously holds if M and N are replaced by v and w.

(ii) Likewise, for complementary quantum channels $\mathcal{E}$ and $\mathcal{F}$, the
information bias defined in (48),

$\Delta\chi_K(w; \mathcal{E}, \mathcal{F}) = \Delta\chi_K(N; \mathcal{E}, \mathcal{F}) = \Delta S_K(\mathcal{E}, \mathcal{F})$ (53)

is invariant to the choice of orthonormal basis w or rank-1 POVM N, and

$\chi_K(M, \mathcal{E}) - \chi_K(N, \mathcal{E}) = \chi_K(M, \mathcal{F}) - \chi_K(N, \mathcal{F})$. (54)

□



This theorem provides a natural interpretation for the entropy bias of a tripartite pure
state: this is the amount by which more (or less if the bias is negative) w information
about a is present in b than it is in c. The theorem tells us that this excess, which we
call the information bias, does not depend upon the orthonormal basis w, allowing us to drop
the w from $\Delta\chi_K(b, c)$ under these conditions. This theorem is used in proving
several of the results that follow, including Theorems 8, 10, and 11.

Example 1. As an illustration, suppose that in the case of a qubit, $d_a = 2$, the z
information associated with the standard $|0\rangle$, $|1\rangle$ basis is perfectly
transmitted from a to b, while no information in the conjugate x basis is transmitted; i.e.,
we have a perfect "classical" channel from a to b. Setting M = z and N = x in (52) and using
Lemma 1, $H(z) = \chi(z, c) - \chi(x, c)$, which can only be true if $\chi(x, c) = 0$ and
$H(z) = \chi(z, c)$. The z information is thus perfectly transmitted from a to c, saying the
"classical" information (in this sense) is always copied to another party, and further, by
the basis-invariance of $\Delta\chi(b, c) = 0$, that the ab and the ac channels are equally
effective in terms of the $\chi$ measure.^ This conclusion can be reached by alternative
lines of argument, but it illustrates the nontrivial content of Theorem 3.
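Part (ii) of Theorem 3 can also be checked on a concrete pair of complementary channels. The sketch below uses the qubit amplitude damping channel, which is an illustrative choice made here and not an example from the text: with the isometry $V|0\rangle = |0\rangle_b|0\rangle_c$, $V|1\rangle = \sqrt{1-\gamma}\,|1\rangle_b|0\rangle_c + \sqrt{\gamma}\,|0\rangle_b|1\rangle_c$, the channel to b is amplitude damping with parameter $\gamma$ and the channel to c is amplitude damping with parameter $1-\gamma$. States are represented by Bloch vectors and the von Neumann entropy is used.

```python
import math

def binary_entropy(x):
    """Return -x log x - (1-x) log(1-x) in nats, using 0 log 0 = 0."""
    return -sum(t * math.log(t) for t in (x, 1.0 - x) if t > 0.0)

def qubit_entropy(r):
    """von Neumann entropy of the qubit state with Bloch vector r."""
    norm = min(1.0, math.sqrt(sum(c * c for c in r)))
    return binary_entropy((1.0 + norm) / 2.0)

def amplitude_damping(gamma):
    """Bloch-vector action of amplitude damping (decay toward |0>) with parameter gamma."""
    s = math.sqrt(1.0 - gamma)
    return lambda r: (s * r[0], s * r[1], gamma + (1.0 - gamma) * r[2])

def chi_basis(channel, n):
    """chi(w, channel) of (39) for the basis whose two states have Bloch vectors +n, -n."""
    outputs = [channel(n), channel(tuple(-c for c in n))]
    average = tuple((a + b) / 2.0 for a, b in zip(*outputs))
    return qubit_entropy(average) - sum(qubit_entropy(o) for o in outputs) / 2.0

gamma = 0.3
channel_e = amplitude_damping(gamma)        # a -> b
channel_f = amplitude_damping(1.0 - gamma)  # a -> c, the complementary channel

directions = [
    (0.0, 0.0, 1.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (math.sin(0.6) * math.cos(1.3), math.sin(0.6) * math.sin(1.3), math.cos(0.6)),
]
information_biases = [chi_basis(channel_e, n) - chi_basis(channel_f, n) for n in directions]
origin = (0.0, 0.0, 0.0)  # Bloch vector of the maximally mixed input
entropy_bias = qubit_entropy(channel_e(origin)) - qubit_entropy(channel_f(origin))
```

All entries of `information_biases` should coincide with `entropy_bias`, as (53) requires, even though the individual $\chi$ values differ from one basis to the next.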



V. GENERALIZING ALL-OR-NOTHING 
THEOREMS 

In this section we consider various quantitative generalizations, using the information
measures introduced in Sec. III, of some "all-or-nothing" theorems [?], which have the
general form that in a multipartite system if a particular type or types of information
about a particular subsystem a is perfectly present or absent in some other subsystem, then
some other types of information about a will also be perfectly present or absent in other
locations. In each subsection below we provide a quantitative generalization of such a
theorem to situations of partial presence or absence, indicating the connection with the
all-or-nothing theorem if it is not already clear.



A. Truncation 

The Truncation theorem of [2] states that if $\Pi = \{\Pi_j\}$ is a projective decomposition
of $I_a$, and if the $\Pi$ type of information about a is perfectly present in c, then for
any third system b, the density operator $\rho_{ab}$ is truncated or block-diagonal (or
"pinched", p. 50 of [?]) in the sense that $\rho_{ab} = \sum_j \Pi_j \rho_{ab} \Pi_j$. The
following result is a generalization of this theorem to the case of partial information
presence in c, and also allows for more general POVMs P in Part (ii). The all-or-nothing
result comes out by setting $H(\Pi|c) = 0$ (perfect information presence) in (55) below,
which implies that $\rho_{ab} = \sum_j \Pi_j \rho_{ab} \Pi_j$ since $S(\rho\|\sigma) = 0$
only if $\rho = \sigma$. More generally, $\rho_{ab}$ will be "close" (in the relative entropy
sense) to the truncated form if $H(\Pi|c)$ is small.

^ An explicit example of this is the GHZ state $(|000\rangle + |111\rangle)/\sqrt{2}$.

Lemma 4. Let $\Pi = \{\Pi_j\}$ be a projective decomposition of $I_a$ and let $P = \{P_j\}$
be any POVM on a.

(i) Let $\rho_{abc}$ be a pure state, then

$H(\Pi|c) = S\big(\rho_{ab} \,\big\|\, \sum_j \Pi_j \rho_{ab} \Pi_j\big)$. (55)

(ii) Let $\rho_{abc}$ be any state, then

$H(P|c) \geq S\big(\rho_{ab} \,\big\|\, \sum_j P_j \rho_{ab} P_j\big)$. (56)

□

The proof can be found in Appendix C. This lemma is also used in proving the uncertainty
relation in the next section.



B. Information exclusion relations 

An exclusion relation refers to incompatible types of information such that the presence of
one type in one subsystem "hinders" or to some extent "excludes" the incompatible type from
being present in a different subsystem. Thus the Exclusion theorem of [?] asserts that if v
and w are mutually unbiased bases on a, and the v information about a is perfectly present
in b, then the w information about a is (completely) absent from c. A quantitative extension
of this to partial presence and absence can be based on the following theorem, where the
incompatibility of two POVMs $P = \{P_j\}$ and $Q = \{Q_k\}$ is quantified using:



$r(P, Q) := \max_{j,k} \big\| \sqrt{P_j} \sqrt{Q_k} \big\|_\infty^2$. (57)



Here $\|\cdot\|_\infty$ denotes the supremum norm: the maximum
singular value of the operator.
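For rank-1 POVM elements the norm in the overlap r(P, Q) can be evaluated without any matrix decomposition. The sketch below assumes that r is the maximum over j, k of the squared supremum norm of $\sqrt{P_j}\sqrt{Q_k}$, so that for $P_j = a_j|\phi_j\rangle\langle\phi_j|$ and $Q_k = b_k|\psi_k\rangle\langle\psi_k|$ each term equals $a_j b_k |\langle\phi_j|\psi_k\rangle|^2$; the data layout (weight, unit vector) and the function name are illustrative choices.

```python
import math

def overlap_r(P, Q):
    """r(P, Q) for rank-1 POVMs given as lists of (weight, unit_vector) pairs."""
    best = 0.0
    for a, phi in P:
        for b, psi in Q:
            inner = sum(x.conjugate() * y for x, y in zip(phi, psi))
            best = max(best, a * b * abs(inner) ** 2)
    return best

s = 1.0 / math.sqrt(2.0)
z_basis = [(1.0, (1.0, 0.0)), (1.0, (0.0, 1.0))]
x_basis = [(1.0, (s, s)), (1.0, (s, -s))]
y_basis = [(1.0, (s, 1j * s)), (1.0, (s, -1j * s))]

r_zx = overlap_r(z_basis, x_basis)  # mutually unbiased qubit bases: 1/2
r_xy = overlap_r(x_basis, y_basis)  # also mutually unbiased: 1/2
r_zz = overlap_r(z_basis, z_basis)  # identical bases: 1
```

For orthonormal bases the weights are all one and the function returns the largest squared overlap between basis vectors, which equals $1/d_a$ exactly when the bases are mutually unbiased.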

Our main result, with proof in Appendix D, is:
Theorem 5. Let $\rho_{abc}$ be any state on $\mathcal{H}_{abc}$.

(i) Let $P = \{P_j\}$ and $Q = \{Q_k\}$ be any two POVMs on $\mathcal{H}_a$, with
$H(\cdot|\cdot)$ defined in (41) and r in (57). Then

$H(P|b) + H(Q|c) \geq \log[1/r(P, Q)]$, (58)

where each $H(\cdot|\cdot)$ term is bounded by, e.g.:

$H(P|b) \geq \log[1/\sqrt{r(P, P)}]$. (59)

(ii) Specializing (58) to the case of orthonormal bases $v = \{|v_j\rangle\langle v_j|\}$ and
$w = \{|w_k\rangle\langle w_k|\}$, we obtain:

$H(v|b) + H(w|c) \geq \log[1/r(v, w)]$, (60)

where in this case (57) reads

$r(v, w) = \max_{j,k} |\langle v_j|w_k\rangle|^2$. (61)

(iii) The right-hand side of (60) is largest when v and w are MUBs, $r(v, w) = 1/d_a$:

$H(v|b) + H(w|c) \geq \log d_a$. (62)

□

We remark that (60) is equivalent to the main inequality conjectured in [8] and proven in
[7], see Sec. VI B, and (58) was also recently proven in [9] using smooth entropies, an
approach different from ours. Our proof approach is based on the relative entropy; we will
go into more detail about this approach in a subsequent article [?].

It is useful to view the inequalities in Theorem 5 in two different ways, as information
exclusion relations and as entropic uncertainty relations. The fact that they contain both
principles can be seen, for example, in the MUB case by rewriting (62) as:

$H(v) + H(w) \geq \chi(v, b) + \chi(w, c) + \log d_a$. (63)

Viewed from the left-hand side it looks like an entropic uncertainty relation: a lower bound
on an entropic sum. Viewed from the right-hand side it looks like an information exclusion
relation: an upper bound on an information sum. We note here that setting $H(v|b) = 0$ in
(62) implies $H(w|c) = \log d_a$, the maximum value, and thus c contains no information
about w, demonstrating that our result implies (and thus generalizes) the Exclusion theorem
from [?].

As (60) was proven in [7], consider the following example illustrating how (58) goes beyond
(60).
Example 2. Set Q to the w basis, and let P be a POVM composed of n pure states or rank-1
operators, each with trace $d_a/n$ and each of which is unbiased with respect to the w
basis. Applying (58) gives

$H(P|b) + H(w|c) \geq \log n$. (64)



Now suppose c contains all the w information, $H(w|c) = 0$. This implies that
$H(P|b) = \log n$, which in turn implies two conditions: the probabilities of the $P_j$ are
equal, so there is maximal missing information about which $P_j$ state system a is in, and
the P information must be perfectly absent from b: $\chi(P, b) = 0$. The latter means that
all states in P get mapped by (6) to the same output density operator $\rho_{bj}$ on b. For
example, for $d_a = 2$ consider setting w to the z basis (standard basis); then P could be
the four states making up the x and y bases, or three states forming an equilateral triangle
in the xy plane of the Bloch sphere, or any symmetric set of states in the xy plane.
Imagining P to be composed of a very large number of states in the xy plane, by continuity
all states in the xy plane must get mapped to the same output density operator $\rho_{bj}$
on b when $H(z|c) = 0$; a result that does not come out of pairing z with a particular MUB,
say x, and using (60). This all-or-nothing result is implied by the Truncation theorem of
[2], but (64) also describes the partial information case, saying that the $\rho_{bj}$
associated with P must be fairly indistinguishable if $H(w|c)$ is small.

Inspired by (and strengthening) a result in [25], Eq. (59) is, in some sense, an uncertainty
relation for a single POVM. Rewriting it as

$H(P) \geq \chi(P, b) + \log[1/\sqrt{r(P, P)}]$, (65)

it strengthens the bound $H(P) \geq \chi(P, b)$ from Lemma 1, stating that if the P
measurement outcome is fairly certain [$H(P)$ small], this can partially exclude the P
information from another system b [$\chi(P, b)$ small]. The idea is that a POVM is generally
not a set of mutually-exclusive properties (Sec. II A), so it has some intrinsic
incompatibility, as measured by $\log[1/\sqrt{r(P, P)}]$. For example, if P is composed of n
rank-1 operators each with trace $d_a/n$, then $\log[1/\sqrt{r(P, P)}] = \log(n/d_a)$.
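The two overlap values used in Example 2 and in the single-POVM bound can be confirmed for the symmetric qubit POVMs mentioned there (n states evenly spaced in the xy plane of the Bloch sphere). The sketch assumes $d_a = 2$, assumes r is defined through the squared supremum norm of $\sqrt{P_j}\sqrt{Q_k}$, and represents each rank-1 element by its trace and Bloch direction, using $|\langle\phi|\psi\rangle|^2 = (1 + \hat{n}_\phi\cdot\hat{n}_\psi)/2$; the function names are hypothetical.

```python
import math

def symmetric_xy_povm(n):
    """n rank-1 elements of trace 2/n with Bloch directions evenly spaced in the xy plane."""
    return [(2.0 / n, (math.cos(2.0 * math.pi * j / n), math.sin(2.0 * math.pi * j / n), 0.0))
            for j in range(n)]

def overlap_r(P, Q):
    """r(P, Q) for rank-1 qubit POVMs given as (trace, Bloch direction) pairs."""
    return max(tp * tq * (1.0 + sum(a * b for a, b in zip(dir_p, dir_q))) / 2.0
               for tp, dir_p in P for tq, dir_q in Q)

z_basis = [(1.0, (0.0, 0.0, 1.0)), (1.0, (0.0, 0.0, -1.0))]

results = {}
for n in (3, 4, 6):
    P = symmetric_xy_povm(n)
    results[n] = {
        "log_inverse_r_with_z": math.log(1.0 / overlap_r(P, z_basis)),     # expect log n
        "single_povm_term": math.log(1.0 / math.sqrt(overlap_r(P, P))),   # expect log(n/2)
    }
```

With n = 3 this is the equilateral-triangle POVM, and n = 4 is the union of the x and y bases with each projector weighted by 1/2.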

Some information exclusion relations below for quantum channels are proven in Appendix E.
Although they follow from Theorem 5, they bring to mind a slightly different picture [?], as
one imagines Alice sending "incompatible ensembles" P and Q respectively through
$\mathcal{E}$ and $\mathcal{F}$, and if the $\mathcal{F}$ channel transmits the Q ensemble
well to Carol, then the $\mathcal{E}$ channel must be constructed in such a way that at its
output Bob will have difficulty discerning which member of the P ensemble Alice sends.
Corollary 6. For complementary quantum channels $\mathcal{E}$ and $\mathcal{F}$, with
$\chi$ given by (39),

(i) Let P and Q be any two POVMs, with $H(P) = H(\{p_j\})$ where $p_j$ is given by (40) and
likewise for $H(Q)$,

$\chi(P, \mathcal{E}) \leq H(P) - \log[1/\sqrt{r(P, P)}]$, (66)

$\chi(P, \mathcal{E}) + \chi(Q, \mathcal{F}) \leq H(P) + H(Q) - \log[1/r(P, Q)]$. (67)

(ii) For orthonormal bases v and w,

$\chi(v, \mathcal{E}) + \chi(w, \mathcal{F}) \leq \log[d_a^2\, r(v, w)]$. (68)

(iii) For MUBs v and w,

$\chi(v, \mathcal{E}) + \chi(w, \mathcal{F}) \leq \log d_a$. (69)

□
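The exclusion relation for orthonormal bases in Corollary 6 can be probed numerically on a family of complementary qubit channels. As an illustrative assumption made here (not an example from the text), the sketch takes amplitude damping with parameter $\gamma$ as the channel to b and amplitude damping with parameter $1-\gamma$ as its complement, represents bases by Bloch directions, uses $r(v, w) = (1 + |\hat{n}_v\cdot\hat{n}_w|)/2$ for qubit bases, and reads the bound for bases as $\log[d_a^2\, r(v, w)]$ with $d_a = 2$.

```python
import math

def binary_entropy(x):
    """Return -x log x - (1-x) log(1-x) in nats, using 0 log 0 = 0."""
    return -sum(t * math.log(t) for t in (x, 1.0 - x) if t > 0.0)

def qubit_entropy(r):
    """von Neumann entropy of the qubit state with Bloch vector r."""
    norm = min(1.0, math.sqrt(sum(c * c for c in r)))
    return binary_entropy((1.0 + norm) / 2.0)

def amplitude_damping(gamma):
    """Bloch-vector action of amplitude damping (decay toward |0>) with parameter gamma."""
    s = math.sqrt(1.0 - gamma)
    return lambda r: (s * r[0], s * r[1], gamma + (1.0 - gamma) * r[2])

def chi_basis(channel, n):
    """chi(w, channel) for the basis whose two states have Bloch vectors +n and -n."""
    outputs = [channel(n), channel(tuple(-c for c in n))]
    average = tuple((a + b) / 2.0 for a, b in zip(*outputs))
    return qubit_entropy(average) - sum(qubit_entropy(o) for o in outputs) / 2.0

def direction(polar, azimuth):
    return (math.sin(polar) * math.cos(azimuth),
            math.sin(polar) * math.sin(azimuth),
            math.cos(polar))

directions = [direction(0.25 * math.pi * i, 0.5 * math.pi * j)
              for i in range(5) for j in range(4)]

# Smallest slack of the bound over a grid of channels and pairs of bases.
min_slack = math.inf
for step in range(11):
    gamma = step / 10.0
    channel_e, channel_f = amplitude_damping(gamma), amplitude_damping(1.0 - gamma)
    for nv in directions:
        for nw in directions:
            r = (1.0 + abs(sum(a * b for a, b in zip(nv, nw)))) / 2.0
            bound = math.log(4.0 * r)
            min_slack = min(min_slack, bound - chi_basis(channel_e, nv) - chi_basis(channel_f, nw))

# The MUB bound is saturated by a perfect channel: gamma = 0, v = x, w = z.
perfect, useless = amplitude_damping(0.0), amplitude_damping(1.0)
mub_sum = chi_basis(perfect, (1.0, 0.0, 0.0)) + chi_basis(useless, (0.0, 0.0, 1.0))
```

A nonnegative `min_slack` is consistent with the corollary on this grid; it is a spot check for one channel family, not a proof.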

As another corollary to Theorem 5, some uncertainty relations for a single system [?] (see
Sec. VI B) can be strengthened for mixed states, with the proof in Appendix F.
Corollary 7.

(i) For any state $\rho$, let N be a rank-1 POVM and let P be any POVM, then

$H(N) \geq \log[1/\sqrt{r(N, N)}] + S(\rho)$, (70)
$H(N) + H(P) \geq \log[1/r(N, P)] + S(\rho)$. (71)

(ii) For any state $\rho$ of a qubit (dimension d = 2) and any complete set of three MUBs x,
y, and z:

$H(x) + H(y) + H(z) \geq 2\log 2 + S(\rho)$. (72)

□

While one might conjecture that (70) or (71) generalizes to the case where N is an arbitrary
POVM, it is easy to see that this is false. Imagine a highly mixed state such that
$S(\rho)$ is very large, yet N and P are composed of coarse-grained projectors with very
high rank, so $H(N)$ and $H(P)$ would be small, violating the inequality.

Note that (72) is a tight bound, achieved for example when the state is along the z-axis of
the Bloch sphere, such that $H(x) = H(y) = \log 2$ and $H(z) = S(\rho_a)$.
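The qubit bound (72) and its tightness on the z-axis can be examined directly in the Bloch representation, where a measurement along axis i yields outcome probabilities $(1 \pm r_i)/2$ and the state has eigenvalues $(1 \pm |r|)/2$. The following sketch (natural logarithm, hypothetical function names, a coarse grid of states) is meant as a numerical illustration only.

```python
import math

def binary_entropy(x):
    """Return -x log x - (1-x) log(1-x) in nats, using 0 log 0 = 0."""
    return -sum(t * math.log(t) for t in (x, 1.0 - x) if t > 0.0)

def mub_entropy_gap(r):
    """H(x) + H(y) + H(z) - [2 log 2 + S(rho)] for the qubit state with Bloch vector r."""
    norm = min(1.0, math.sqrt(sum(c * c for c in r)))
    outcome_entropy = sum(binary_entropy((1.0 + c) / 2.0) for c in r)
    return outcome_entropy - (2.0 * math.log(2.0) + binary_entropy((1.0 + norm) / 2.0))

# Scan Bloch vectors on a grid with spacing 0.1 inside the unit ball.
grid = [(i / 10.0, j / 10.0, k / 10.0)
        for i in range(-10, 11) for j in range(-10, 11) for k in range(-10, 11)
        if i * i + j * j + k * k <= 100]
smallest_gap = min(mub_entropy_gap(r) for r in grid)

# A state on the z-axis saturates the bound.
gap_on_z_axis = mub_entropy_gap((0.0, 0.0, 0.6))
```

On the z-axis the x and y outcomes are uniformly random while the z outcome distribution coincides with the spectrum of the state, which is why the gap closes there.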

C. Suppression of differences 

The following is a bipartite result, proved in Appendix G, saying that the presence of some
type of information P about a in b suppresses the difference in the presence of two other
types of information, M and N, about a in b. Note that a similar result holds for quantum
channels.

Theorem 8. Let $\rho_{ab}$ be any state.

(i) For any POVM P on a; rank-1 POVMs M and N on a,

$|\chi(M, b) - \chi(N, b)| \leq H(P|b) + \max\{H(M) - \log[1/r(P, M)],\; H(N) - \log[1/r(P, N)]\}$,
$|H(M|b) - H(N|b)| \leq H(P|b) + \max\{H(M) - \log[1/r(P, N)],\; H(N) - \log[1/r(P, M)]\}$. (73)

(ii) For orthonormal bases u, v, w on a, with u and v each MU with respect to w,

$|\chi(u, b) - \chi(v, b)| \leq H(w|b)$,
$|H(u|b) - H(v|b)| \leq H(w|b)$. (74)

(iii) Let u, v, w be as in (ii), but in addition assume that the w type is perfectly present
in b, then

$\chi_K(u, b) = \chi_K(v, b) = S_K(\rho_b) - S_K(\rho_{ab})$,
$H(u|b) = H(v|b) = \log d_a + S(a|b)$, (75)

meaning that all types MU to w are present to the same degree in b, in this sense. □

The difference suppression effect is most apparent in part (ii) of this theorem, where (74)
says that the presence in b of the w information forces types MU to the w type to be equally
present in b, in the sense of having the same $\chi$ and H quantities. As an illustration,
consider $d_a = 2$ and let the z information about a be perfectly present in b; then all
types in the xy plane of the Bloch sphere are present in b to the same degree, bringing to
mind the image of a prolate spheroid (American football), with z being the major axis, for
the information about a present in b.
D. Decoupling theorems

The preceding results can be used to generalize some all-or-nothing decoupling theorems,
which provide sufficient conditions about the information content of b and/or c to ensure
that c is completely uncorrelated to or decoupled from a. For example, the No Splitting
theorem of [4] states that if all types of information about a are perfectly present in b,
then all types of information about a are perfectly absent from c. (The name is motivated by
the idea that a perfect quantum channel from a to b allows no diversion or split off of
information to a third location c.) In our notation this corresponds to the assertion that
when $H(w|b) = 0$ for every orthonormal basis w of $\mathcal{H}_a$, then $\chi(w, c) = 0$
for every such basis. What follows is a quantitative generalization, a corollary of Theorem
5, with the No Splitting theorem the special case when $\alpha = 0$.

Corollary 9. Let $\alpha$ be some positive constant.

(i) For any state $\rho_{abc}$, if $H(w|b) \leq \alpha$ for every orthonormal basis w of
$\mathcal{H}_a$, then $\chi(w, c) \leq \alpha$ for every such basis.

(ii) For complementary quantum channels $\mathcal{E}$ and $\mathcal{F}$, if
$\chi(w, \mathcal{E}) \geq \log d_a - \alpha$ for every orthonormal basis w of the channel
input, then $\chi(w, \mathcal{F}) \leq \alpha$ for every such basis.

Proof. For any orthonormal basis w there is a MU basis v, and thus (62) implies that
$H(w|c) \geq \log d_a - \alpha$ and hence, because $H(w)$ cannot exceed $\log d_a$,
$\chi(w, c)$ cannot be greater than $\alpha$. The channel version follows by the same
argument using (69). □
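Written out step by step, with $\alpha$ denoting the positive constant of Corollary 9 and v any basis mutually unbiased to w, the argument for part (i) is the following chain (a restatement of the proof above, using (62) and the definition (41)):

```latex
\begin{align*}
H(v|b) + H(w|c) &\geq \log d_a
  && \text{by (62)} \\
H(w|c) &\geq \log d_a - H(v|b) \geq \log d_a - \alpha
  && \text{since } H(v|b) \leq \alpha \text{ by hypothesis} \\
\chi(w,c) &= H(w) - H(w|c) \leq \log d_a - (\log d_a - \alpha) = \alpha
  && \text{by (41) and } H(w) \leq \log d_a
\end{align*}
```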

The Presence theorem of [?] states that if two strongly incompatible types of information
about a are perfectly present in b, then all information about a is perfectly present in b,
which is to say there is a perfect quantum channel from a to b. Unfortunately, "strongly
incompatible" is a complicated concept, and it is not obvious how to extend it to a
quantitative measure in the general case. Instead, we consider two POVMs N and P, and the
case where they are MUBs implies they are strongly incompatible types of information. The
following theorem, proved in Appendix H, combines the notions of "presence" and "no
splitting", and gradually specializes from POVMs to orthonormal bases to MUBs. Note that
part (ii) of this theorem is stated for channels to remind the reader that each of our
results for states has some analogous formulation for channels.^
Theorem 10. For any POVM P on a; rank-1 POVMs M and N on a; orthonormal bases u, v, w on a;

(i) For any bipartite state $\rho_{ab}$,

$H(N|b) + H(P|b) \geq \log[1/r(N, P)] + S(a|b)$, (76)

where, for any tripartite state $\rho_{abc}$,

$\chi(M, b) \geq [-S(a|b)]$, and $H(M|c) \geq [-S(a|b)]$. (77)

(ii) For complementary quantum channels $\mathcal{E}$ and $\mathcal{F}$,

$\chi(u, \mathcal{E}) \geq \Delta\chi(\mathcal{E}, \mathcal{F}) \geq \chi(v, \mathcal{E}) + \chi(w, \mathcal{E}) - \log[d_a^2\, r(v, w)]$, (78)

$\chi(u, \mathcal{F}) \leq \log[d_a^3\, r(v, w)] - [\chi(v, \mathcal{E}) + \chi(w, \mathcal{E})]$. (79)

(iii) For MUBs v and w,

$S(a:b) \geq 2\log d_a - 2[H(v|b) + H(w|b)]$, (80)
$S(a:c) \leq H(v|b) + H(w|b)$. (81)

□

^ When one obtains an upper bound on a $\chi$ quantity for a channel as in Theorems 6, 9, 10,
and 11, this bound also holds if one composes any channel $\mathcal{G}$ with $\mathcal{E}$,
i.e. $\chi(P, \mathcal{G} \circ \mathcal{E}) \leq \chi(P, \mathcal{E})$ by (38), which is
useful if one is interested in bounding information in a subsystem of the output of
$\mathcal{E}$.

Corollary 9 gave a condition to guarantee that no information is present in c, and it is
that all information is present in b. But what part (iii) of Theorem 10 shows is that one
need not check that every single type of information is present in b; rather, simply check
that b contains two types that are MUBs and this will completely decouple c from a [?]. Part
(ii) emphasizes that the information in c (transmitted by $\mathcal{F}$) can be upper
bounded and that in b (transmitted by $\mathcal{E}$) lower bounded even when the two bases
are not MUBs. Part (i) generalizes this notion further to POVMs. By (76) one can lower-bound
$[-S(a|b)]$, some measure of the entanglement between a and b, just by knowing that b
contains information about a rank-1 POVM on a and an arbitrary POVM on a. By (77) this
serves to lower-bound both the M information in b and the M information missing from c, for
any rank-1 POVM M. The application of such a relation, specialized to orthonormal bases, to
quantum cryptography was discussed previously in [7], and the generalization to POVMs might
turn out to be useful.

There is a seemingly odd restriction in (76) that either N or P must be composed of rank-1
elements. One might conjecture that (76) holds for arbitrary POVMs, but this is false. One
can see this by choosing $\rho_{ab} = \rho_a \otimes \rho_b$, in which case (76) reduces to
(71). As discussed following Corollary 7, (71) could be violated dramatically if
$S(\rho_a)$ was large but both N and P were composed of high-rank projectors.

The following decoupling theorem considers the situation where some type of information w is
both perfectly present in b and absent from c. It shows that this simple condition is,
strikingly, enough to completely decouple c from a, and furthermore, for pure states, it
leads to the suppression of differences between all types of information in b. The theorem
gradually specializes from all states to pure states to channels, with the proof in
Appendix I.

Theorem 11. Let L, M, N be rank-1 POVMs and P be any POVM on a; let v and w be orthonormal
bases on a.

(i) Let $\rho_{abc}$ be any state, then

$S(a:c) \leq \chi(N, c) + H(N|b) - \log[1/\sqrt{r(N, N)}]$. (82)

If the N type of information is perfectly present in b, then

$\chi_K(P, c) \leq \chi_K(N, c)$. (83)

If, in addition, the N type of information is absent from c, then all types of information
about a are absent from c, i.e. a and c are completely uncorrelated:
$\rho_{ac} = \rho_a \otimes \rho_c$.

(ii) In the special case of pure states $\rho_{abc} = |\Omega\rangle\langle\Omega|$,

$|\chi(L, b) - \chi(M, b)| \leq \chi(N, c) + H(N|b) - \log[1/\sqrt{r(N, N)}]$, (84)

and thus, in the extreme case where the N type of information is perfectly present in b and
absent from c,

$\chi(L, b) = \chi(M, b) = S(\rho_a)$ (85)

is independent of rank-1 POVM (or orthonormal basis).

(iii) For complementary channels $\mathcal{E}$ and $\mathcal{F}$,

$\chi(v, \mathcal{F}) \leq \chi(w, \mathcal{F}) + [\log d_a - \chi(w, \mathcal{E})]$,
$\chi(v, \mathcal{E}) \geq \chi(w, \mathcal{E}) - \chi(w, \mathcal{F})$. (86)

Thus, if the w type of information is perfectly present in the $\mathcal{E}$ channel and
absent from the $\mathcal{F}$ channel, the same is true for all types of information. This
is a necessary and sufficient condition for $\mathcal{E}$ being a perfect quantum channel
and $\mathcal{F}$ being a completely noisy channel. □

VI. CONNECTION WITH OTHER WORK 

A. Difference of Holevo quantities 

Schumacher and Westmoreland [?] remarked that a difference in $\chi$ quantities associated
with sending an ensemble of pure states through complementary quantum channels depends only
on the average density operator of the input ensemble. This situation is equivalent to the
one considered in (51) of Theorem 3, where a rank-1 POVM N acts on system a of a tripartite
pure state $|\Omega\rangle$. The equivalence follows from the discussion in Sec. II B:
decompose $|\Omega\rangle$ into an isometry V acting on half of a bipartite pure state
$|\Phi\rangle$, as in (7) and Fig. 1; then by the construction in (22), any pure-state
ensemble at the input of V can be produced by an appropriate choice of N and
$|\Phi\rangle$. Despite this equivalence, the notion of basis invariance or invariance to
the rank-1 POVM N emerges naturally out of the state view, since the average density
operator of the input ensemble to V is unaffected by the choice of N. If one is willing to
restrict to inputting the maximally-mixed average density operator, then the
basis-invariance emerges in the channel view as well, as in (53).

B. Entropic uncertainty relations 

Our inequalities are related to several entropic uncertainty relations in the literature
(see [28] for a recent review), which are translated below into our notation. Maassen and
Uffink [?] proved an entropic uncertainty relation for measurements in orthonormal bases v
and w on system a for any state $\rho_a$:

$H(v) + H(w) \geq \log[1/r(v, w)]$. (87)

Krishna and Parthasarathy [25] generalized this to POVMs P and Q,

$H(P) + H(Q) \geq \log[1/r(P, Q)]$, (88)

and also stated an uncertainty relation for a single POVM

$H(P) \geq \log[1/\sqrt{r(P, P)}]$. (89)

Hall [?] incorporated into (87) the idea of "classical" side information, i.e. information
about the outcome of a POVM $X_e$ acting on a system e that may be correlated to a:

$H(v) + H(w) \geq \log[1/r(v, w)] + H(v:X_e) + H(w:X_e)$. (90)

Considering e to be a composite system bc and $X_e = Q_b \otimes R_c$ a composite POVM, it
follows from (90) that:

$H(v|Q_b) + H(w|R_c) \geq \log[1/r(v, w)]$, (91)

see the discussion in [8], where (91) was termed the weak complementary information tradeoff
and was ascribed to Cerf et al. [?].

The inequalities in Theorem 5, (58), (59), and (60), respectively strengthen (88), (89), and
(87) by allowing for quantum side information, for example, information about property P
contained in another quantum system b, as measured by $\chi(P, b)$. The presence of such
$\chi$ quantities, reducing the left-hand sides of the Theorem 5 inequalities, is precisely
what strengthens these bounds. Equation (90) follows from (91) [and thus (60)] by an
argument that can be found in [?]. Equation (91) follows from (60) using the Holevo bound
(45): $H(v|Q_b) \geq H(v|b)$ and $H(w|R_c) \geq H(w|c)$.

Equation (60) is precisely the "strong complementary
information tradeoff" conjectured by Renes and Boileau
and later proven by Berta et al. [7]. It is straightforward
to show that our definition of $H(v|b)$ in (41) is
equivalent to the definition employed in [8] and [7]; see (42).

The main inequality in Berta et al. [7],

$$H(v|b) + H(w|b) \geq \log[1/r(v,w)] + S(a|b), \tag{92}$$

was formulated for orthonormal bases v and w, and we
generalized it to POVMs (with at least one POVM being
rank-1) in (76). Also, (92) is equivalent to (60)
as follows. Apply (60) to a pure state $\rho_{abc}$ and use
$H(w|b) = H(w|c) - \Delta\chi(b,c)$ with $S(a|b) = -\Delta\chi(b,c)$
to get (92). Conversely, starting from (92), follow the
reverse process to prove (60) for pure states $\rho_{abc}$, and then
(60) for mixed states follows from ([551). Thus, since (60)
is generalized to two arbitrary POVMs by ([55)), ([55)) and
(76) provide two alternative generalizations of (92). To
prove (92), Berta et al. first proved an uncertainty relation
involving smooth minimum and maximum entropies,
and then invoked a lemma that these entropies approach
the desired von Neumann entropic quantities under an
appropriate asymptotic limit. In contrast, our proof does
not use smooth entropies, but invokes the monotonicity
of the relative entropy under quantum operations, so the
approaches are conceptually different.
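
The inequality (92) can be probed numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes $H(v|b) = H(v) - \chi(v,b)$ with $\chi(v,b) = S(\rho_b) - \sum_j p_j S(\rho_{bj})$, assumes $r(v,w) = \max_{j,k}|\langle v_j|w_k\rangle|^2$, and works in bits. It evaluates the slack of (92) for a maximally entangled two-qubit state with the z and x bases, and for randomly generated mixed states and bases. All helper names are hypothetical.

```python
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def reduced_b(rho_ab, da, db):
    """Partial trace over a, returning rho_b."""
    return np.trace(rho_ab.reshape(da, db, da, db), axis1=0, axis2=2)

def cond_entropy(rho_ab, basis, da, db):
    """H(v|b) = H(v) - chi(v, b); the basis vectors are the columns of `basis`."""
    total = 0.0
    for j in range(da):
        bra = np.kron(basis[:, j].conj()[None, :], np.eye(db))  # <v_j| (x) I_b
        sub = bra @ rho_ab @ bra.conj().T                        # p_j * rho_bj
        p_j = np.trace(sub).real
        if p_j > 1e-12:
            total += -p_j * np.log2(p_j) + p_j * vn_entropy(sub / p_j)
    return total - vn_entropy(reduced_b(rho_ab, da, db))

def berta_slack(rho_ab, v, w, da, db):
    """Left side minus right side of (92)."""
    r = np.max(np.abs(v.conj().T @ w) ** 2)
    s_a_given_b = vn_entropy(rho_ab) - vn_entropy(reduced_b(rho_ab, da, db))
    lhs = cond_entropy(rho_ab, v, da, db) + cond_entropy(rho_ab, w, da, db)
    return lhs - np.log2(1.0 / r) - s_a_given_b

def random_unitary(d, rng):
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

# Maximally entangled state with the mutually unbiased z and x bases.
phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
z_basis = np.eye(2, dtype=complex)
x_basis = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
bell_slack = berta_slack(np.outer(phi, phi.conj()), z_basis, x_basis, 2, 2)

# Random mixed two-qubit states and random bases on a.
rng = np.random.default_rng(2)
random_slacks = []
for _ in range(100):
    g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    rho = g @ g.conj().T
    rho /= np.trace(rho).real
    random_slacks.append(
        berta_slack(rho, random_unitary(2, rng), random_unitary(2, rng), 2, 2))
min_random_slack = min(random_slacks)
```

For the maximally entangled state both conditional entropies vanish and $S(a|b) = -\log d_a$, so the slack is zero there; the random samples only check that the slack stays non-negative.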

Christandl and Winter [16] derived an information exclusion
relation for quantum channels, which can be rearranged
and expressed in our notation to read

$$\chi(x,\mathcal{E}) + \chi(z,\mathcal{F}) \leq \log d_a, \tag{93}$$

where x and z are orthonormal bases related to each other
by the d-dimensional quantum Fourier transform, and
$\mathcal{E}$ and $\mathcal{F}$ are complementary quantum channels.
Equation (69) of Corollary 6 generalizes this to arbitrary
MUBs, and (67) further generalizes to input ensembles
associated with POVMs.
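
The exclusion relation (93) can be illustrated for a qubit input. The sketch below is a minimal example under assumptions that are not spelled out in this section: the complementary channels are generated from an isometry V from the input to b ⊗ c (with $\mathcal{E}$ tracing out c and $\mathcal{F}$ tracing out b), the input ensemble consists of the basis states with equal probabilities, and logarithms are base 2. The helper names are hypothetical.

```python
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def random_isometry(d_in, d_out, rng):
    """Isometry with orthonormal columns, from a reduced QR decomposition."""
    z = rng.normal(size=(d_out, d_in)) + 1j * rng.normal(size=(d_out, d_in))
    q, _ = np.linalg.qr(z)
    return q

def channel_outputs(V, rho, db, dc):
    """Outputs of the complementary channels: (trace out c, trace out b)."""
    out = (V @ rho @ V.conj().T).reshape(db, dc, db, dc)
    return np.trace(out, axis1=1, axis2=3), np.trace(out, axis1=0, axis2=2)

def holevo(V, basis, db, dc, which):
    """Holevo quantity of the uniform ensemble of basis states at output b or c."""
    d = basis.shape[0]
    idx = 0 if which == "b" else 1
    chi = vn_entropy(channel_outputs(V, np.eye(d) / d, db, dc)[idx])
    for j in range(d):
        vj = basis[:, j:j + 1]
        chi -= vn_entropy(channel_outputs(V, vj @ vj.conj().T, db, dc)[idx]) / d
    return chi

d, db, dc = 2, 2, 2
z_basis = np.eye(2, dtype=complex)
x_basis = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

rng = np.random.default_rng(1)
totals = []
for _ in range(50):
    V = random_isometry(d, db * dc, rng)
    totals.append(holevo(V, x_basis, db, dc, "b") + holevo(V, z_basis, db, dc, "c"))
max_total = max(totals)

# Extreme case: a perfect channel to b, with c left in a fixed state.
V_ideal = np.zeros((db * dc, d), dtype=complex)
V_ideal[0, 0] = 1.0   # |0> -> |0>_b |0>_c
V_ideal[2, 1] = 1.0   # |1> -> |1>_b |0>_c
ideal_total = holevo(V_ideal, x_basis, db, dc, "b") + holevo(V_ideal, z_basis, db, dc, "c")
```

In the extreme case all of the x information reaches b and none of the z information reaches c, so the bound $\log d_a$ is saturated.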

Our results strengthen some uncertainty relations in
the case of mixed states. In the special case where N is a
rank-1 POVM, (70) and (71) respectively strengthen (89)
and (88) with the addition of the $S(\rho_a)$ term. Sanchez-Ruiz
[26] proved an entropic uncertainty relation for sets
of $d_a + 1$ MUBs, which when applied to qubits ($d_a = 2$)
gives

$$H(x) + H(y) + H(z) \geq 2\log 2. \tag{94}$$

Likewise, this is strengthened for mixed states by (72).
Bounds depending on the purity of $\rho_a$ were also given in
[26]; in the qubit case these bounds are implied by (72).

C. No Splitting and Decoupling 

Kretschmann et al. [s^l have studied the degree to
which a channel is error-correctable using a diamond-norm
measure, and showed that when a channel is nearly
perfect (in this sense) its complementary channel transmits
very little information, and vice versa. Bény and
Oreshkov [33] formulated a similar theorem for complementary
channels, but in a general, symmetric fashion,
using a fidelity measure. Hayden and Winter jsj
have studied the degree to which a channel preserves
the distinguishability of input states, and formulated the
tradeoff in geometry-preservation between complementary
channels using a trace-distance measure. Each of
these formulations generalizes the No Splitting principle
(see Sec. V D), although their information measures are
of a different nature from the one we employ, and the
connection between our approach and theirs remains to
be determined. Intuitively, the No Splitting theorem
should also be related to the notion that entanglement is
monogamous. Quantitative expressions of entanglement
monogamy have been found in terms of the concurrence
and the squashed entanglement Q; as these are "global"
measures of correlation, their relation to our
information-type-specific measure is not obvious.

Renes and Boileau [1] formulated a decoupling theorem
as a corollary to their conjectured uncertainty relation
[Eq. (60)], stating that if b contains the information
about two sufficiently incompatible orthonormal bases of
a, then the coupling of c to a can be upper-bounded.
This is quite similar to our Theorem 10, which extends
this notion to two sufficiently incompatible POVMs.



VII. CONCLUSIONS 
A. Summary 

Since our technical results in Secs. IV and V involve
a large number of theorems, the following comments are
intended to assist the reader in seeing how they are related
to one another and to the definitions given earlier
in Secs. II and III.

In Sec. II A we generalize an earlier notion of types
of quantum information to include general POVMs on a
Hilbert space $\mathcal{H}_a$ for system a, by noting that the
associated probabilities are the same as those for a projective
decomposition of the identity on a larger Hilbert space
$\mathcal{H}_A$, the Naimark extension, and a rank-1 POVM
corresponds to an orthonormal basis of the extension. Various
measures for different types of information are introduced
and discussed in Sec. III. For uniformity of notation,
Shannon entropies and related quantities are denoted by
$H()$; e.g., $H(P_a)$ is the missing information about type
$P_a$, as determined by its probability distribution, when
the quantum state is assumed known. For quantum entropies
we use $S()$ for the von Neumann entropy, and
$S_K()$, where K can be R or T or Q for Rényi, Tsallis,
and quadratic entropies, respectively.

We use the Holevo function $\chi(P_a,b)$, or $\chi_K(P_a,b)$ for
$S_K$, (36), as a measure of the amount of information of
type $P_a$ about system a which is present in system b,
along with the complementary quantity $H(P_a|b)$, (41),
as a corresponding measure of the amount of information
about $P_a$ that is still missing given system b. While
the analogy is not exact, $\chi(P_a,b)$ is similar to Shannon's
mutual information $H(P_a:Q_b)$, whereas $H(P_a|b)$ resembles
Shannon's conditional entropy $H(P_a|Q_b)$. In particular,
$H(P_a|b)$, like $H(P_a|Q_b)$, is nonnegative, so it retains
some of the intuition of the latter quantity, in contrast
to the quantum conditional entropy $S(a|b)$, (jSOp,
which can be of either sign. We use the term information
bias for the difference between the amount of
type $P_a$ information about a in b and the amount in c,
$\chi(P_a,b) - \chi(P_a,c) = \Delta\chi(P_a;b,c)$, which can have either
sign. Similarly, we refer to $\Delta S(b,c) = S(\rho_b) - S(\rho_c)$
as the entropy bias, and add a subscript K when using
an alternative to the von Neumann entropy. Our
most extensive results are for the von Neumann entropy
and its associated information measures. However, in
some cases, see Theorems 2, 3, 5, and 11, these results
also hold for a more general $\chi_K$, and stating them in
this form seems worthwhile, as for certain purposes these
other measures could be useful.

While the most natural and symmetrical formulation of
our results, in the sense of treating the different parts
on the same footing, is in terms of a tripartite system,
some of the more interesting and significant applications
are to quantum channels and complementary channels.
The relationship between the tripartite and the channel
perspectives is worked out in some detail in Sec. II B,
and in Sec. III C we relate the coherent information for
a quantum channel to a corresponding tripartite entropy
bias. In several theorems the channel counterparts of
tripartite results are stated separately, because while the
formal results are in some sense the same, one's intuition
about their significance can be different.

Our first set of results are the equalities in Theorems 2
and 3 of Sec. IV, which apply for pure quantum states of
bipartite and tripartite systems, respectively. The first
says that the amount of information about a in b is
independent of the type of information, provided the latter is
a rank-1 POVM; this includes an orthonormal basis. The
second says that the difference between the amount of
information concerning such a rank-1 POVM in b and in c is
independent of the type considered, and equal to the
corresponding entropy bias. Equivalently, given two rank-1
POVMs M and N, the difference between the amount of
M and N information about a found in b is the same as
the corresponding difference in c. While these results are
limited to pure states, they are important for the proofs
of many of the later results. They also extend from von
Neumann to other quantum entropies, so they are stated
in this more general form.

Perhaps the simplest way of viewing the collection of
inequalities that make up Sec. V is that the main theorems
are quantitative generalizations of all-or-nothing
theorems which can be stated quite concisely for types
of information associated with orthonormal bases v and
w of system a. A central result of this paper is Theorem 5,
and part (iii) of this theorem tells us that if the
v information about a is perfectly present in b, which
is to say $H(v|b) = 0$, then the mutually unbiased (MU)
w type of information must be perfectly absent from c:
$H(w|c) = \log d_a$, which means that $\chi(w,c) = 0$. Part (ii) allows
for bases that are not MU at the cost of a weaker bound
on the H measures, while part (i) is not restricted to
bases but applies to quite general types of information P
and Q. The generalization to POVMs is, in turn, based
on Lemma 4, which itself generalizes the Truncation theorem
[2]: if the $v = \{v_j\}$ information is perfectly present
in c then $\rho_{ab}$ commutes with the $v_j$ projectors.

The connections of Theorem 5 to entropic uncertainty
relations in the literature are given in Sec. VI B. Broadly
speaking, we think that the addition of quantum side
information to uncertainty relations 0, 13l not only
strengthens certain bounds but also gives further conceptual
insight into the nature of complementarity, in that
side information about complementary observables in
different locations (Sec. V B) must be constrained as well.
We also note a recent experimental study [ssl ]. Further
remarks on the significance of Theorem 5 can be found
in the discussion that follows it in Sec. V B.

Corollary 6 of Theorem 5 gives the corresponding result
for quantum channels, generalizing to partial information
and to arbitrary POVMs or orthonormal bases
the all-or-nothing theorem: if the v information is perfectly
present in (or transmitted by) the $\mathcal{E}$ channel, any
MU type of information w must be absent from (or destroyed
by) the complementary channel $\mathcal{F}$. In addition,
Corollary 7 of Theorem 5 provides strengthened information
inequalities for a single system described by a mixed
state.

The idea behind Theorem 8 is encapsulated in the
observation that if the information about an orthonormal
basis w of a is perfectly present in b, so $H(w|b) = 0$, and
u and v are bases of a that are MU with respect to w (but
not necessarily with respect to each other), then the u and
v types are present in b in equal amounts. If, on the other
hand, the w information is less than perfectly present in
b, this theorem provides quantitative bounds on the
difference between the u and v types of information in b.
Similarly, the requirement that u and v be MU relative
to w can be relaxed, and they can even be replaced with
rank-1 POVMs, and w with a general POVM, see part (i)
of the theorem, at the price of appropriately weakening
the bounds that confine the differences.

The results in Sec. V D provide quantitative generalizations
of conditions that ensure system c is completely
uncorrelated to (or decoupled from) system a, $\rho_{ac} = \rho_a \otimes \rho_c$.
Corollary 9 of Theorem 5 says that the correlations between
a and c are tightly upper-bounded if system b almost
perfectly contains all types of information about a,
and gives the analogous result for complementary channels
$\mathcal{E}$ and $\mathcal{F}$. But Theorem 10 stresses the importance of
the presence of just two (sufficiently incompatible) types
of information. That is, if b perfectly contains the
information about two MUBs of a, then b contains all types
of information about a, and c is completely uncorrelated
to a; a generalization of this statement for the partial
information case is given in part (iii) of Theorem 10. Parts
(ii) and (i) of this theorem respectively illustrate that this
idea can be extended, at the price of weakened bounds,
to any two orthonormal bases or to two POVMs in which
at least one of the POVMs is rank-1. The relevance of
inequalities like (76) of Theorem 10, where the presence
of two types of information about a in b can be used to
upper bound the information about a in c, to quantum
cryptography was discussed in .

The same sort of decoupling occurs when a single type
of information about a associated with an orthonormal
basis w is perfectly present in b and completely absent
from c. Theorem 11 contains this interesting result together
with certain quantitative generalizations, both
when the type of information in question is only partially
absent from c, and when it is not perfectly present
in b.

B. Future outlook

There are various ways in which the results summarized
above suggest problems which deserve further attention
and research. One has to do with the difference
between rank-1 and higher-rank POVMs, or orthonormal
bases as against coarser projective decompositions of the
identity. In a number of cases the results we have obtained
for the former are distinctly stronger than for the
latter, but the reason for this is not always clear. Since
applications of quantum information theory to macroscopic
systems, in particular to problems of decoherence,
lead rather naturally to coarse decompositions or
POVMs, a good intuitive understanding in addition to
formal expressions would be of value. A second item concerns
the use of the $r(P,Q)$ overlap measure for POVMs,
or its $r(v,w)$ counterpart for orthonormal bases, see (57)
and (61). While this provides the basis of significant
inequalities in Theorem 5 and later, the fact that $r(P,Q)$
requires one to maximize over all pairs of elements from
the two POVMs hints that stronger results might well be
possible were one to use a more refined perspective on
how the POVMs are related to each other, or the sorts
of information that they provide.

While quantitative inequalities are certainly an advance
over simple all-or-nothing results, it would be even better
if one could express information tradeoffs in terms
of equalities of the sort which could conceivably allow
one to completely characterize how a quantum channel
is related to its complementary channel using a (hopefully
small) number of parameters with a clear intuitive
significance. The equalities in Theorem 3, as applied either
to channels or, more generally, pure-state tripartite
systems, hint that something like this might be possible,
but thus far we have not found it.

Any advance in understanding tripartite systems raises
an obvious question: what about systems with four (or
more) parts? It is, of course, possible to study them
by thinking of two of the parts as constituting a single
object, and then applying results for tripartite systems.
But there is probably some "residual" aspect of a system
of four parts which cannot be captured in this way, just
as there are residual aspects of tripartite systems which
cannot be understood simply in terms of combining two
of them so as to yield a bipartite system. We think that
our results in this paper have helped to clarify some of
this tripartite residual, and we hope they provide hints
on ways to deal with more complicated cases.

Appendix A: Proof of Lemma 1

Proof, (ii) The inequality x{Pa,b) < S{pb) obviously fol- 
lows from Now let c be a system that purifies 
Pab- Then by §8^, x{Pa,b) ^ x{PaM = S{pa) ~ 

To prove $S(a:b) \geq \chi(P_a,b)$, as in Sec. [lEl think of $P_a$
as a projective measurement $W_{ae}$ on system ae, where $W_{ae}$
is a coarse graining of some orthonormal basis (rank-1
projectors) $w_{ae}$. Let c purify $\rho_{ab}$ such that
$\rho_{abce} = \rho_{abc} \otimes |e_0\rangle\langle e_0|$ is a pure state. Then

$$S(a:b) = S(\rho_{ae}) + S(\rho_b) - S(\rho_c)
= \chi(w_{ae},bc) + \chi(w_{ae},b) - \chi(w_{ae},c)
\geq \chi(w_{ae},b) \geq \chi(W_{ae},b) = \chi(P_a,b),$$

by the Theorems in Sec. IV, by (38), and by □

Appendix B: Proof of Theorem 3

Proof. (i) For orthonormal basis $w = \{|w_j\rangle\}$, insert (36)
into (gll) to obtain

$$\chi_K(w,b) - \chi_K(w,c) = S_K(\rho_b) - S_K(\rho_c) - \sum_j p_j\,[S_K(\rho_{bj}) - S_K(\rho_{cj})]. \tag{B1}$$

The final term vanishes, for the following reason. Write
the pure state of abc as $\sum_j |w_j\rangle \otimes |s_j\rangle$, in the form (fTU)l
with $|w_j\rangle$ replacing $|a_j\rangle$, so from ^ the conditional density
operators in (B1) are given by

$$p_j\rho_{bj} = \mathrm{Tr}_c\big(|s_j\rangle\langle s_j|\big), \qquad p_j\rho_{cj} = \mathrm{Tr}_b\big(|s_j\rangle\langle s_j|\big). \tag{B2}$$

Since $|s_j\rangle$ is a pure state, the partial traces $\rho_{bj}$ and $\rho_{cj}$
have the same eigenvalues (determined by the Schmidt
expansion coefficients of $|s_j\rangle$), except one may have more
zeros than the other if $d_b \neq d_c$. Since $S_K(\rho)$ is a function
only of the nonzero (positive) eigenvalues of $\rho$, each
term in the final sum in (B1) vanishes, and we are left
with (51). The generalization to rank-1 POVMs follows
by the equivalence of N to an orthonormal basis
$v_A$ on $\mathcal{H}_A$, the Naimark extension of $\mathcal{H}_a$, as in Sec. II A.
Since $\rho_{Abc}$ is a tripartite pure state,
$\Delta\chi_K(N;b,c) = \Delta\chi_K(v_A;b,c) = S_K(\rho_b) - S_K(\rho_c)$.

(ii) Equation (53) follows from (51) by applying it to a
channel ket $|\Omega\rangle$ constructed from F by Q. Alternatively,
it can be proven directly from (pQl) and (48), obtaining
an equation similar to (B1),

$$\Delta\chi_K(P;\mathcal{E},\mathcal{F}) = S_K(T_b/d_a) - S_K(T_c/d_a) - \sum_j p_j\,[S_K(\rho_{bj}) - S_K(\rho_{cj})],$$

where the final term vanishes again since
$\rho_{bj} = \mathrm{Tr}_c[V\rho_{aj}V^\dagger]$ and $\rho_{cj} = \mathrm{Tr}_b[V\rho_{aj}V^\dagger]$ have the same
(nonzero) spectrum, as the $\rho_{aj}$ in (40) are rank-1 operators.
Equations (52) and (54) follow immediately from (51)
and (53), respectively. □
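
The basis invariance established in part (i) can be checked numerically. The sketch below is an illustration, not the authors' code: it assumes the von Neumann case with $\chi(w,b) = S(\rho_b) - \sum_j p_j S(\rho_{bj})$ and the conditional states of (B2), draws a random pure state on a ⊗ b ⊗ c with $d_b \neq d_c$, and compares the information bias $\chi(w,b) - \chi(w,c)$ with the entropy bias $S(\rho_b) - S(\rho_c)$ for two unrelated bases on a. The function names are hypothetical.

```python
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def random_unitary(d, rng):
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

def biases(psi, basis, da, db, dc):
    """Return (chi(w,b) - chi(w,c), S(rho_b) - S(rho_c)) for a pure state psi on abc."""
    t = psi.reshape(da, db, dc)
    rho_b = np.einsum('abc,aec->be', t, t.conj())
    rho_c = np.einsum('abc,abe->ce', t, t.conj())
    chi_b, chi_c = vn_entropy(rho_b), vn_entropy(rho_c)
    for j in range(da):
        # Unnormalized conditional ket |s_j> = <w_j|psi> on bc, stored as a db x dc matrix.
        s_j = np.einsum('a,abc->bc', basis[:, j].conj(), t)
        p_j = float(np.sum(np.abs(s_j) ** 2))
        if p_j > 1e-12:
            chi_b -= p_j * vn_entropy(s_j @ s_j.conj().T / p_j)   # rho_bj
            chi_c -= p_j * vn_entropy(s_j.T @ s_j.conj() / p_j)   # rho_cj
    return chi_b - chi_c, vn_entropy(rho_b) - vn_entropy(rho_c)

da, db, dc = 2, 2, 3
rng = np.random.default_rng(3)
psi = rng.normal(size=da * db * dc) + 1j * rng.normal(size=da * db * dc)
psi /= np.linalg.norm(psi)

bias_1, entropy_bias = biases(psi, random_unitary(da, rng), da, db, dc)
bias_2, _ = biases(psi, random_unitary(da, rng), da, db, dc)
```

Both `bias_1` and `bias_2` should agree with `entropy_bias` up to rounding, since each $\rho_{bj}$ and $\rho_{cj}$ share their nonzero spectrum.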



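Appendix C: Proof of Lemma 4

Proof. (i)

$$\begin{aligned}
S\Big(\rho_{ab}\,\Big\|\,\sum_j \Pi_j\rho_{ab}\Pi_j\Big)
&= -S(\rho_{ab}) - \mathrm{Tr}\Big[\rho_{ab}\log\Big(\sum_j \Pi_j\rho_{ab}\Pi_j\Big)\Big] && \text{(C1)}\\
&= -S(\rho_c) - \mathrm{Tr}\Big[\sum_{k,l}\rho_{ab}\,\Pi_k\log\Big(\sum_j \Pi_j\rho_{ab}\Pi_j\Big)\Pi_l\Big] && \text{(C2)}\\
&= -S(\rho_c) - \mathrm{Tr}\Big[\sum_k \Pi_k\rho_{ab}\Pi_k\log\Big(\sum_j \Pi_j\rho_{ab}\Pi_j\Big)\Big]\\
&\quad - \mathrm{Tr}\Big[\sum_{k,l\neq k}\rho_{ab}\,\Pi_k\log\Big(\sum_j \Pi_j\rho_{ab}\Pi_j\Big)\Pi_l\Big] && \text{(C3)}\\
&= -S(\rho_c) + S\Big(\sum_j \Pi_j\rho_{ab}\Pi_j\Big) && \text{(C4)}\\
&= -S(\rho_c) + H(\Pi) + \sum_j p_j S(\rho_{abj}) && \text{(C5)}\\
&= H(\Pi) - \chi(\Pi,c) = H(\Pi|c), && \text{(C6)}
\end{aligned}$$

where $p_j = \mathrm{Tr}(\Pi_j\rho_{ab})$ and $p_j\rho_{abj} = \Pi_j\rho_{ab}\Pi_j$. The
last term in (C3) disappears because $\log(\sum_j \Pi_j\rho_{ab}\Pi_j)$
is block diagonal with respect to the $\Pi_j$ projectors, and
then one takes an off-diagonal element of it. Step (C5)
follows from Lemma 1 part (i).

(ii) For clarity, we include the subscript a on the
POVM $P_a$. Think of $P_a = \{P_{aj}\}$ as a projective measurement
$\Pi_A = \{\Pi_{Aj}\}$ on an extended Hilbert space $\mathcal{H}_A$
(Naimark extension), with $\mathcal{H}_a$ a subspace and $E_a$ the
projector onto this subspace, and $P_{aj} = E_a\Pi_{Aj}E_a$. The
state $\rho_{Ab}$ is the same as $\rho_{ab}$ but now just embedded in
a larger space, that is: $\rho_{Ab} = E_a\rho_{Ab}E_a = \rho_{ab}$. Let $E_a^\perp$
be the projector onto the orthogonal complement of $\mathcal{H}_a$,
note $E_a^\perp\rho_{Ab}E_a^\perp = 0$, and let the channel $\mathcal{T}$ be defined
by $\mathcal{T}(\rho) = E_a\rho E_a + E_a^\perp\rho E_a^\perp$. Then if d is a system that
purifies $\rho_{abc}$, we have:

$$\begin{aligned}
H(P_a|c) &= H(\Pi_A|c) && \text{(C7)}\\
&\geq H(\Pi_A|cd) = S\Big(\rho_{Ab}\,\Big\|\,\sum_j \Pi_{Aj}\rho_{Ab}\Pi_{Aj}\Big) && \text{(C8)}\\
&\geq S\Big(\mathcal{T}(\rho_{Ab})\,\Big\|\,\mathcal{T}\Big(\sum_j \Pi_{Aj}\rho_{Ab}\Pi_{Aj}\Big)\Big) && \text{(C9)}\\
&= S\Big(E_a\rho_{Ab}E_a\,\Big\|\,E_a\Big(\sum_j \Pi_{Aj}\rho_{Ab}\Pi_{Aj}\Big)E_a + E_a^\perp\Big(\sum_j \Pi_{Aj}\rho_{Ab}\Pi_{Aj}\Big)E_a^\perp\Big) && \text{(C10)}\\
&= S\Big(E_a\rho_{Ab}E_a\,\Big\|\,\sum_j E_a\Pi_{Aj}E_a\,\rho_{Ab}\,E_a\Pi_{Aj}E_a\Big) && \text{(C11)}\\
&= S\Big(\rho_{ab}\,\Big\|\,\sum_j P_{aj}\rho_{ab}P_{aj}\Big). && \text{(C12)}
\end{aligned}$$

Note that the term with $E_a^\perp$ in (C10) disappeared because
it lies outside of the support of $E_a\rho_{Ab}E_a$. □

Appendix D: Proof of Theorem 5

Proof. First let us prove the single-POVM uncertainty
relation as follows, defining $\lambda_{\max}(\cdot)$ to be the maximum
eigenvalue. From Lemma 4,

$$\begin{aligned}
H(P|b) &\geq S\Big(\rho_{ac}\,\Big\|\,\sum_j P_j\rho_{ac}P_j\Big) && \text{(D1)}\\
&\geq S\Big(\rho_c\,\Big\|\,\sum_j \mathrm{Tr}_a[P_j\rho_{ac}P_j]\Big) && \text{(D2)}\\
&\geq S\Big(\rho_c\,\Big\|\,\sum_j \lambda_{\max}(P_j)\,\mathrm{Tr}_a[P_j\rho_{ac}]\Big) && \text{(D3)}\\
&\geq S\Big(\rho_c\,\Big\|\,\max_j \lambda_{\max}(P_j)\sum_j \mathrm{Tr}_a[P_j\rho_{ac}]\Big) && \text{(D4)}\\
&= S\Big(\rho_c\,\Big\|\,\max_j \lambda_{\max}(P_j)\,\rho_c\Big) && \text{(D5)}\\
&= -\log\max_j \lambda_{\max}(P_j) = -\log\max_j \|P_j\|_\infty && \text{(D6)}\\
&\geq -\log\max_{j,k} \big\|\sqrt{P_j}\sqrt{P_k}\big\|_\infty. && \text{(D7)}
\end{aligned}$$

We invoked ^ for step (D2). We used dH
for step (D3): $\lambda_{\max}(P_j) I_a \geq P_j$, which implies
$\mathrm{Tr}_a[\lambda_{\max}(P_j)\,T_{ac}] \geq \mathrm{Tr}_a[P_j T_{ac}]$, where
$T_{ac} = \sqrt{P_j}\,\rho_{ac}\sqrt{P_j}$ is a positive operator. We also used (34)
for step (D4): $\max_j \lambda_{\max}(P_j)\sum_j A_j \geq \sum_j \lambda_{\max}(P_j) A_j$,
where the $A_j$ are positive operators.

Now for the two-POVM uncertainty relation, consider
the quantum channel [as in (42)]
$\mathcal{E}_Q(\rho_{ab}) = \sum_k |e_k\rangle\langle e_k| \otimes \mathrm{Tr}_a(Q_k\rho_{ab})$
associated with the Q measurement, where
$\{|e_k\rangle\}$ is an orthonormal basis of an auxiliary system
e. One can verify that $\mathcal{E}_Q$ is trace-preserving, and its
complete positivity follows from the fact that
$(\mathcal{E}_Q \otimes \mathcal{I}_c)(\rho_{abc}) = \sum_k |e_k\rangle\langle e_k| \otimes \mathrm{Tr}_a(Q_k\rho_{abc})$
is a positive operator for any system c, where $\mathcal{I}_c$ is the
identity channel for c. Also, define $G_{jk} = \sqrt{P_j}\,Q_k\sqrt{P_j}$, and note
$G_{jk} \leq \lambda_{\max}(G_{jk}) I_a$, and $r(P,Q) = \max_{j,k} \lambda_{\max}(G_{jk})$.
Then from Lemma 4,

$$\begin{aligned}
H(P|c) &\geq S\Big(\rho_{ab}\,\Big\|\,\sum_j P_j\rho_{ab}P_j\Big)
\geq S\Big(\mathcal{E}_Q(\rho_{ab})\,\Big\|\,\sum_j \mathcal{E}_Q(P_j\rho_{ab}P_j)\Big) && \text{(D8)}\\
&= S\Big(\sum_l |e_l\rangle\langle e_l| \otimes \mathrm{Tr}_a\{Q_l\rho_{ab}\}\,\Big\|\,\sum_{j,k} |e_k\rangle\langle e_k| \otimes \mathrm{Tr}_a\{G_{jk}\sqrt{P_j}\,\rho_{ab}\sqrt{P_j}\}\Big) && \text{(D9)}\\
&\geq S\Big(\sum_l |e_l\rangle\langle e_l| \otimes \mathrm{Tr}_a\{Q_l\rho_{ab}\}\,\Big\|\,\sum_{j,k} \lambda_{\max}(G_{jk})\,|e_k\rangle\langle e_k| \otimes \mathrm{Tr}_a\{P_j\rho_{ab}\}\Big) && \text{(D10)}\\
&\geq S\Big(\sum_l |e_l\rangle\langle e_l| \otimes \mathrm{Tr}_a\{Q_l\rho_{ab}\}\,\Big\|\,r(P,Q)\,I_e \otimes \rho_b\Big) && \text{(D11)}\\
&= -\log r(P,Q) - S\Big(\sum_l |e_l\rangle\langle e_l| \otimes \mathrm{Tr}_a\{Q_l\rho_{ab}\}\Big)
 - \mathrm{Tr}\Big[\Big(\sum_l |e_l\rangle\langle e_l| \otimes \mathrm{Tr}_a\{Q_l\rho_{ab}\}\Big)\log(I_e \otimes \rho_b)\Big] && \text{(D12)}\\
&= -\log r(P,Q) - H(Q) - \sum_l q_l S(\rho_{bl}) + S(\rho_b) = -\log r(P,Q) - H(Q|b), && \text{(D13)}
\end{aligned}$$

where $q_l = \mathrm{Tr}(Q_l\rho_{ab})$ and $q_l\rho_{bl} = \mathrm{Tr}_a(Q_l\rho_{ab})$. We
invoked (pS)) for step (D8), and we invoked (34) for steps
(D10) and (D11). [For (D11), $\lambda_{\max}(G_{jk}) \leq r(P,Q)$
for each j, k, so replacing each $\lambda_{\max}(G_{jk})$ with $r(P,Q)$
makes the overall operator larger.] Step (D13) involves
Lemma 1 part (i). □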
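
The overlap $r(P,Q) = \max_{j,k}\lambda_{\max}(G_{jk})$ with $G_{jk} = \sqrt{P_j}\,Q_k\sqrt{P_j}$, as used in the proof above, is straightforward to evaluate numerically. The sketch below is illustrative only; the helper names and the choice of the qubit "trine" POVM as a test case are assumptions made for the example, not taken from the paper.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    ev, vec = np.linalg.eigh(m)
    ev = np.clip(ev, 0.0, None)
    return (vec * np.sqrt(ev)) @ vec.conj().T

def overlap_r(P, Q):
    """r(P, Q) = max over j, k of the largest eigenvalue of sqrt(P_j) Q_k sqrt(P_j)."""
    best = 0.0
    for P_j in P:
        s = psd_sqrt(P_j)
        for Q_k in Q:
            best = max(best, float(np.linalg.eigvalsh(s @ Q_k @ s)[-1]))
    return best

def projectors(basis):
    """Rank-1 projectors onto the columns of a unitary matrix."""
    return [np.outer(basis[:, j], basis[:, j].conj()) for j in range(basis.shape[1])]

z_povm = projectors(np.eye(2))
x_povm = projectors(np.array([[1, 1], [1, -1]]) / np.sqrt(2))

# Qubit trine POVM: three rank-1 elements (2/3)|psi_k><psi_k| with Bloch
# vectors 120 degrees apart in a plane; the elements sum to the identity.
trine = []
for k in range(3):
    theta = 2 * np.pi * k / 3
    psi = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    trine.append((2.0 / 3.0) * np.outer(psi, psi))

r_zx = overlap_r(z_povm, x_povm)          # mutually unbiased bases
r_z_trine = overlap_r(z_povm, trine)
r_trine_trine = overlap_r(trine, trine)
```

For orthonormal bases the function reduces to the largest squared overlap $|\langle v_j|w_k\rangle|^2$, and for the trine POVM the single-POVM bound (D7) becomes $-\log\sqrt{r(P,P)} = \log(3/2)$.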



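Appendix E: Proof of Corollary 6

Proof. (i) Consider a channel ket $|\Omega\rangle$ on $\mathcal{H}_{abc}$ with
$P = \{P_j\}$ and $Q = \{Q_k\}$ two POVMs on a, and apply (59)
and (58) to $|\Omega\rangle$:

$$\begin{aligned}
\chi(P,b) &\leq H(P) - \log[1/\sqrt{r(P,P)}],\\
\chi(P,b) + \chi(Q,c) &\leq H(P) + H(Q) - \log[1/r(P,Q)].
\end{aligned} \tag{E1}$$

Now decompose $|\Omega\rangle = (I_a \otimes V)|\Phi\rangle$ as in Sec. II B and
Fig. 1, where system a' (of the same dimension as a) is
introduced and fed into isometry V, and the state
$|\Phi\rangle = (1/\sqrt{d_a})\sum_j |j\rangle_a |j\rangle_{a'}$ is maximally entangled, expanded
here in the computational bases on a and a'. By map-state
duality [Tj], think of $|\Phi\rangle$ as an isometry $\tilde{V}$ from $\mathcal{H}_a$
to $\mathcal{H}_{a'}$, with $\tilde{V}^\dagger\tilde{V} = I_a$ and $\tilde{V}\tilde{V}^\dagger = I_{a'}$, since $d_a = d_{a'}$.
This means that $\tilde{P} = \{\tilde{P}_j\} = \{\tilde{V}P_j\tilde{V}^\dagger\}$ and
$\tilde{Q} = \{\tilde{Q}_k\} = \{\tilde{V}Q_k\tilde{V}^\dagger\}$ are POVMs on a'. If outcome $P_j$ of P occurs
on a, then element $\tilde{P}_j$ will get fed into the isometry V,
so $\chi(P,b) = \chi(\tilde{P},\mathcal{E})$ and likewise $\chi(Q,c) = \chi(\tilde{Q},\mathcal{F})$,
where $\mathcal{E}$ and $\mathcal{F}$ are the (complementary) channels to b
and c, respectively, associated with isometry V. Also,
since $\rho_a = I_a/d_a$ for a channel ket, the probability for
$P_j$ in (l6| given by $p_j = \mathrm{Tr}(P_j\rho_a) = \mathrm{Tr}(P_j)/d_a$ reduces to
the corresponding formula in (40), so $H(P) = H(\tilde{P})$ and
likewise $H(Q) = H(\tilde{Q})$. Finally, show that
$r(P,Q) = r(\tilde{P},\tilde{Q})$ as follows:

$$\lambda_{\max}\big[(\tilde{Q}_k)^{1/2}\tilde{P}_j(\tilde{Q}_k)^{1/2}\big]
= \lambda_{\max}\big[(Q_k)^{1/2}P_j(Q_k)^{1/2}\big]
= \big\|(P_j)^{1/2}(Q_k)^{1/2}\big\|_\infty^2, \tag{E2}$$

where $\lambda_{\max}[\cdot]$ denotes the maximum eigenvalue and we
used the fact that $(\tilde{Q}_k)^{1/2} = \tilde{V}(Q_k)^{1/2}\tilde{V}^\dagger$, which follows
from $[\tilde{V}(Q_k)^{1/2}\tilde{V}^\dagger]^2 = \tilde{V}Q_k\tilde{V}^\dagger$ since $(\tilde{Q}_k)^{1/2}$ and
$\tilde{V}(Q_k)^{1/2}\tilde{V}^\dagger$ are positive operators. Thus from (E1),

$$\begin{aligned}
\chi(\tilde{P},\mathcal{E}) &\leq H(\tilde{P}) - \log[1/\sqrt{r(\tilde{P},\tilde{P})}],\\
\chi(\tilde{P},\mathcal{E}) + \chi(\tilde{Q},\mathcal{F}) &\leq H(\tilde{P}) + H(\tilde{Q}) - \log[1/r(\tilde{P},\tilde{Q})].
\end{aligned} \tag{E3}$$

Since $\tilde{V}$ is a one-to-one mapping of the set of POVMs on
a to the set of POVMs on a', (E3) must be true for
all POVMs on a', and one can replace $\tilde{P}$ and $\tilde{Q}$ with P
and Q in (E3) for simplicity.

(ii) Equation ([SS)) follows from (67) since
$H(v) = H(w) = \log d_a$ from (go]). □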



Appendix F: Proof of Corollary 7

Proof. (i) For (70), let b be a system that purifies $\rho_a$,
apply (59), and by Theorem 2, $\chi(N,b) = S(\rho_a)$. For
(71), again let b purify $\rho_a$, and apply (58). System c is
completely uncorrelated to a, so $H(P|c) = H(P)$, and by
Theorem 2, $\chi(N,b) = S(\rho_a)$.

(ii) Equation (72) follows from (71) applied to MUBs
x and y:

$$H(x) + H(y) \geq \log 2 + S(\rho_a). \tag{F1}$$

Denote X, Y, and Z as the Pauli operators whose eigenvectors
are the x, y, and z bases. Consider a (possibly
mixed) state in the xy plane of the Bloch sphere:

$$\rho_a = (I_a + \alpha X + \beta Y)/2, \tag{F2}$$

for which $H(z) = \log 2$, so (72) clearly holds for states of
this form using (F1). Now consider varying $\rho_a$ along a
vertical path within the Bloch sphere, from the state $\rho_a$
(in the xy plane) to a state $\rho'_a$ with some z component
but with the same x and y components:

$$\rho'_a = (I_a + \alpha X + \beta Y + \gamma Z)/2. \tag{F3}$$

Denoting the relevant state with a subscript, note that
$H(x)_{\rho_a} = H(x)_{\rho'_a}$ and $H(y)_{\rho_a} = H(y)_{\rho'_a}$ remain constant,
so to prove (72) for general states $\rho'_a$, we just
need to show that $H(z)$ decreases more slowly than $S(\rho_a)$
along this path. This would be true if:

$$H(z)_{\rho'_a} - S(\rho'_a) \geq H(z)_{\rho_a} - S(\rho_a) = \log 2 - S(\rho_a). \tag{F4}$$

Due to the isotropic nature of the Bloch sphere, it is sufficient
to check that (F4) holds for an initial state along the
x-axis: $\rho_a = (I_a + \alpha X)/2$ and $\rho'_a = (I_a + \alpha X + \gamma Z)/2$,
since $H(z)$ and $S(\rho_a)$ will vary in the same way along
a vertical path regardless of an initial unitary rotation
about z. But for such a state, $S(\rho_a) = H(x)_{\rho_a} = H(x)_{\rho'_a}$,
and (F4) reduces to $H(z)_{\rho'_a} + H(x)_{\rho'_a} \geq \log 2 + S(\rho'_a)$,
which is (F1) applied to MUBs z and x. Thus, varying
along a vertical path from a state in the xy plane
to a state with some z-component keeps the values of
$H(x)$ and $H(y)$ constant, while not decreasing the value
of $H(z) - S(\rho_a)$, proving the result in general. □
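
The qubit argument above lends itself to a quick numerical check. The sketch below assumes, as suggested by (F1) and the surrounding proof, that the inequality being established has the form $H(x) + H(y) + H(z) \geq 2\log 2 + S(\rho_a)$; this explicit form is an inference from this appendix rather than a quotation. It samples Bloch vectors uniformly in the unit ball and evaluates the slack of this inequality and of (F1), in bits. The function names are hypothetical.

```python
import numpy as np

def h2(p):
    """Binary entropy in bits, with clipping to avoid log(0)."""
    p = np.clip(p, 1e-15, 1 - 1e-15)
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

def qubit_slacks(bloch):
    """Slacks of (F1) and of H(x)+H(y)+H(z) >= 2 log 2 + S(rho) for a Bloch vector."""
    hx, hy, hz = (h2((1 + c) / 2) for c in bloch)
    s = h2((1 + np.linalg.norm(bloch)) / 2)   # eigenvalues of rho are (1 +/- |r|)/2
    return hx + hy - 1.0 - s, hx + hy + hz - 2.0 - s

rng = np.random.default_rng(4)
slack_f1, slack_three = [], []
for _ in range(2000):
    direction = rng.normal(size=3)
    direction /= np.linalg.norm(direction)
    radius = rng.uniform() ** (1.0 / 3.0)     # uniform sampling in the unit ball
    a, b = qubit_slacks(radius * direction)
    slack_f1.append(a)
    slack_three.append(b)

min_slack_f1 = min(slack_f1)
min_slack_three = min(slack_three)

# Equality cases: a pure z eigenstate and the maximally mixed state.
pure_z = qubit_slacks(np.array([0.0, 0.0, 1.0]))[1]
mixed = qubit_slacks(np.zeros(3))[1]
```

Both minimum slacks should be non-negative up to rounding, with equality in the three-basis inequality for basis eigenstates and for the maximally mixed state.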

Appendix G: Proof of Theorem 8

Proof. Let c be a system that purifies $\rho_{ab}$. Re-write ([55])
as

$$\begin{aligned}
\chi(M,c) &\leq H(P|b) + H(M) + \log r(P,M),\\
\chi(N,c) &\leq H(P|b) + H(N) + \log r(P,N).
\end{aligned} \tag{G1}$$

Taken together, these two inequalities give an upper
bound on the difference $|\chi(M,c) - \chi(N,c)|$. The difference
is at most the one computed by allowing the $\chi$
quantity with the highest upper bound in (G1) to reach
its bound, and allowing the other $\chi$ quantity to be zero.
Thus,

$$|\chi(M,c) - \chi(N,c)| \leq H(P|b) + \max\{H(M) + \log r(P,M),\; H(N) + \log r(P,N)\}. \tag{G2}$$

By (52), substitute b for c on the left-hand side.

Rearranging (G1) to lower bound $H(M|c)$ and $H(N|c)$,
and upper-bounding each respectively by $H(M)$ and
$H(N)$, we can upper-bound their difference by the (maximum)
difference between the upper bound of one and
the lower bound of the other:

$$|H(M|c) - H(N|c)| \leq H(P|b) + \max\{H(M) + \log r(P,N),\; H(N) + \log r(P,M)\}. \tag{G3}$$

Again invoke (52) to switch from c to b and obtain (73).

Now assuming u and v are MU with respect to w, (74)
follows from ([75]) by setting $r(u,w) = r(v,w) = 1/d_a$,
and by noting that $H(u) \leq \log d_a$ and likewise for $H(v)$,
so that the max{} term in (jTSp is non-positive.

Further specializing to the case of $H(w|b) = 0$ and
v MU to w, then ([S^ implies $H(v|c) = \log d_a$ and
$\chi(v,c) = 0$, and in turn that $\chi_K(v,c) = 0$, because all $\chi_K$
measures are zero under the same conditions. Then by
Theorem 3, $H(v|b) = H(v|c) - \Delta\chi(b,c) = \log d_a + S(a|b)$,
and $\chi_K(v,b) = \Delta\chi_K(b,c) = S_K(\rho_b) - S_K(\rho_{ab})$. □

Appendix H: Proof of Theorem 10

Proof. (i) First let c purify $\rho_{ab}$, and by Theorem 3
add the basis-invariant quantity
$\Delta\chi(c,b) = H(N|b) - H(N|c) = S(\rho_c) - S(\rho_b) = S(a|b)$ to both sides of
(58), setting Q = N, to obtain (76). Now to prove
(77), let cd purify $\rho_{ab}$ so that $\rho_{abc} = \mathrm{Tr}_d(\rho_{abcd})$ is
a general (possibly mixed) state. Again by Theorem 3,
$[-S(a|b)] = \chi(M,b) - \chi(M,cd) \leq \chi(M,b)$ and
$[-S(a|b)] = H(M|cd) - H(M|b) \leq H(M|cd) \leq H(M|c)$
by ip).

(ii) The argument for complementary quantum channels
is the same. Add the basis-invariant quantity
$\Delta\chi(\mathcal{E},\mathcal{F})$ to (IMl) to obtain and obtain ^ using
$\chi(u,\mathcal{F}) \leq \log d_a - \Delta\chi(\mathcal{E},\mathcal{F})$.

(iii) Equation ^ follows from
$S(a:b)/2 \geq -S(a|b) \geq \log d_a - [H(v|b) + H(w|b)]$.
For let cd purify $\rho_{ab}$, then
$H(v|b) + H(w|b) \geq \log d_a + S(a|b) \geq S(\rho_a) + S(a|b) = S(a:cd) \geq S(a:c)$. □

Appendix I: Proof of Theorem 11

Proof. (i) First let us prove (82) for pure $\rho_{abc}$:

$$\begin{aligned}
S(a:c) &= S(\rho_a) - \Delta\chi(b,c)\\
&\leq H(N) - \log[1/\sqrt{r(N,N)}] - \Delta\chi(b,c)\\
&= H(N|b) + \chi(N,c) - \log[1/\sqrt{r(N,N)}],
\end{aligned} \tag{I1}$$

where the first line follows from Theorem 3 and the second
line is from (70). Now consider any $\rho_{abc}$. Apply the
just-proven result (I1) to $\rho_{abcd}$:

$$S(a:c) \leq H(N|bd) + \chi(N,c) - \log[1/\sqrt{r(N,N)}], \tag{I2}$$

where $\rho_{abcd}$ is a purification of $\rho_{abc}$. Then, (82) is obtained
by noting that $H(N|bd) \leq H(N|b)$ from ([55)1.

If information about a rank-1 POVM N is perfectly
present in b, this implies that the elements of N must
be orthogonal and hence normalized, i.e., N is some orthonormal
basis $w = \{|w_j\rangle\}$. (The outputs $\rho_{bj}$ cannot
all be orthogonal if the inputs are not orthogonal.) By
the Truncation theorem of [2],
$\rho_{ac} = \sum_j p_j |w_j\rangle\langle w_j| \otimes \rho_{cj}$,
i.e., c is at most classically correlated to the w basis on
a. Then the conditional density operators on c ($\sigma_{ck}$
occurring with probability $q_k$) associated with POVM
$P = \{P_k\}$ are related to those associated with the w
basis by $q_k\sigma_{ck} = \mathrm{Tr}_a(P_k\rho_{ac}) = \sum_j M_{kj}\,p_j\rho_{cj}$, where
$M_{kj} = \langle w_j|P_k|w_j\rangle$. Now use the concavity of the entropy
$S_K$ (all of our entropy functions have this property, see
Sec. III A) and $\sum_k M_{kj} = 1$ to show that:

$$\begin{aligned}
\chi_K(P,c) &= S_K(\rho_c) - \sum_k q_k S_K(\sigma_{ck})\\
&\leq S_K(\rho_c) - \sum_{j,k} M_{kj}\,p_j S_K(\rho_{cj})\\
&= S_K(\rho_c) - \sum_j p_j S_K(\rho_{cj}) = \chi_K(w,c).
\end{aligned} \tag{I3}$$

The remark that $\rho_{ac} = \rho_a \otimes \rho_c$ when all types are absent
from c seems obvious, although it is rigorously proven in
Theorem 1 of

(ii) To prove (84) for pure states, note that the right-hand
side of (82) is an upper bound on $\chi(L,c)$ and
$\chi(M,c)$ by (44), so it must also upper-bound their difference:

$$|\chi(L,c) - \chi(M,c)| \leq \chi(N,c) + H(N|b) - \log[1/\sqrt{r(N,N)}]. \tag{I4}$$

By (52), b can replace c on the left-hand side.

In the case where information about N is perfectly
present in b and absent from c, $\rho_{ac} = \rho_a \otimes \rho_c$ by part (i) of
this theorem, and $S(\rho_b) = S(\rho_{ac}) = S(\rho_a) + S(\rho_c)$ by the
additivity of S for product states. Thus by Theorem 3,
for any rank-1 POVM L,
$\chi(L,b) = \chi(L,b) - \chi(L,c) = S(\rho_b) - S(\rho_c) = S(\rho_a)$.

(iii) Equation (86) follows immediately from
$\chi(v,\mathcal{F}) \leq \log d_a - \Delta\chi(\mathcal{E},\mathcal{F})$ and $\chi(v,\mathcal{E}) \geq \Delta\chi(\mathcal{E},\mathcal{F})$, where
$\Delta\chi(\mathcal{E},\mathcal{F}) = \chi(w,\mathcal{E}) - \chi(w,\mathcal{F})$ is basis-invariant by
Theorem 3. In the extreme case where the w type of
information is perfectly present in $\mathcal{E}$ and absent from $\mathcal{F}$,
$\Delta\chi(\mathcal{E},\mathcal{F}) = \log d_a$, hence $\chi(v,\mathcal{F}) = 0$ and
$\chi(v,\mathcal{E}) = \log d_a$ for all v. □